I. INTRODUCTION


The  ability  to  communicate  securely  has  been  a  challenge  for  millennia. 
For  as  long  as  people  have  tried  to  exchange  private  information,  others  have 
tried  to  compromise  their  privacy.  In  the  modern  communications  environment, 
radio  frequency  communications  and  worldwide  digital  networks,  such  as  the 
Internet, compound the problem. Both are susceptible to eavesdropping, often
trivially. By simply placing an antenna in the region of a radio frequency
broadcast  or  tapping  a  wire  anywhere  between  two  nodes  on  a  digital  network, 
an  uninvited  third  party  can  easily  gain  access  to  seemingly  private 
correspondence.  The  field  of  cryptography — the  practice  and  study  of  hiding 
information — has  made  enormous  progress  combating  the  eavesdropping  threat. 

Cryptography  and  encryption/decryption  methods  fall  into  two  broad 
categories:  symmetric  and  public  key.  In  symmetric  cryptography,  sometimes 
called  classical  cryptography,  parties  share  the  same  encryption/decryption  key. 
Therefore,  before  using  a  symmetric  cryptography  system,  the  users  must 
somehow  come  to  an  agreement  on  a  key  to  use.  An  obvious  problem  arises 
when  the  parties  are  separated  by  large  distances,  which  is  commonplace  in 
today’s  worldwide  digital  communications.  If  the  parties  did  not  meet  prior  to 
their  separation,  how  do  they  agree  on  the  common  key  to  use  in  their  crypto 
system without a secure channel? They could send a trusted courier to
exchange keys, but that is not feasible if time is a critical factor in their
communication.

The  problem  of  securely  distributing  keys  used  in  symmetric  ciphers  has 
challenged  cryptographers  for  hundreds  of  years.  If  an  unauthorized  user  gains 
access  to  the  key,  the  cryptographic  communication  must  be  considered  broken. 
Amazingly, in 1976, Whitfield Diffie and Martin Hellman published a paper in
which they presented a key exchange protocol that provided the first practical
solution to this dilemma. The protocol, named the Diffie-Hellman key exchange
(or key agreement) protocol in their honor, allows two parties to derive a common
secret key by communications over an unsecured channel, while sharing no
secret keying material a priori [2]. While Diffie and Hellman have received
recognition for creating the protocol, it later emerged that the Government
Communications Headquarters (GCHQ), a British intelligence agency, had
independently invented a similar protocol a few years before Diffie and Hellman
published their breakthrough paper. However, the British government classified
the findings, and the results were not released to the public until 1997 [3].

The  Diffie-Hellman  protocol  relies  on  the  difficulty  of  solving  discrete 
logarithms  in  finite  fields  and  the  related  intractability  of  the  Diffie-Hellman 
problem. Due to the difficulty of solving these mathematical problems, an
eavesdropper is unable to efficiently compute the secret key with any or all of the
information intercepted in the open communication channel. Once the secret key
has  been  exchanged  successfully  between  the  two  parties,  they  may  proceed  by 
using  the  key  in  their  symmetric  crypto  system. 

Before  conducting  the  key  exchange  using  the  Diffie-Hellman  protocol,  the 
parties  must  agree  on  a  prime  number  that  defines  the  mathematical 
environment  in  which  the  key  exchange  will  take  place.  If  the  prime  number  is 
large  enough,  a  brute  force  attack  to  find  the  secret  key  becomes  infeasible. 
However,  if  the  two  parties  agree  on  certain  prime  numbers,  an  active  adversary 
can compromise their communication. Using number theory, a man-in-the-middle
attack becomes possible if the prime number that defines the environment
can be written in the form $p = Rq + 1$, where $R$ is a "small" integer and
$q$ is a "large" prime. If so, the attacker can modify the messages
between the two parties so that they both derive a key that belongs to a
subgroup of size $R$. If $R$ is small enough, the attacker can search the keyspace
in a reasonable amount of time, determine the key the parties agreed to, and
eavesdrop on their communication.

This thesis investigates the Diffie-Hellman protocol and the difficulty of the
discrete logarithm problem the protocol relies on. We then analyze the
man-in-the-middle attack described above by developing an algorithm to conduct the attack,
estimating the complexity involved in executing the attack, and approximating the
number of prime numbers that are vulnerable. We then consider several
proposed methods to defend against the attack and demonstrate their
effectiveness. Finally, we extend the attack to several multi-party variants of the
protocol and demonstrate their potential vulnerability.

II. BACKGROUND AND REVIEW


Before beginning a discussion of the Diffie-Hellman protocol and the
man-in-the-middle attack, we present some basic definitions and
theorems. This information is available in any standard algebra text, such as
Fraleigh's Abstract Algebra [4], or discrete mathematics text, such as Rosen's
Discrete Mathematics and Its Applications [5]. It is assumed the reader is
familiar with common mathematical, logical, and set notation.

We conclude the chapter with a brief discussion of computational
complexity and primality testing, which will be useful in our analysis of the
man-in-the-middle attack.

A.  NUMBER  THEORY 

If $a$ and $b$ are integers and $a \neq 0$, we say that $a$ divides $b$ if there is an
integer $c$ such that $b = ac$. When $a$ divides $b$ we say that $a$ is a factor of $b$ and
that $b$ is a multiple of $a$. The notation $a \mid b$ denotes that $a$ divides $b$. Given two
integers $a$ and $b$, both non-zero, the largest integer $d$ such that $d \mid a$ and $d \mid b$ is
called the greatest common divisor of $a$ and $b$. The greatest common divisor of
$a$ and $b$ is denoted by $\gcd(a, b)$. The integers $a$ and $b$ are relatively prime if
their greatest common divisor is one.

Every positive integer greater than one is divisible by at least two integers,
itself and one. If these are its only positive factors, we call this integer prime. A positive
integer that is greater than one, and not prime, is called composite. The primes
are the building blocks of positive integers. The Fundamental Theorem of
Arithmetic states that every positive integer greater than one can be written
uniquely as a prime or as a product of two or more primes, where the prime factors are written
in order of nondecreasing size. Given a positive integer $n$, let the prime
factorization of $n$ be denoted by

$$n = \prod_{i=1}^{k} p_i^{e_i}.$$

In some situations, we care only about the remainder of an integer when it
is divided by some specified positive integer, denoted by $m$. If $a$ and $b$ are
integers, then $a$ is congruent to $b$ modulo $m$ if $m$ divides $a - b$. We use the
notation $a \equiv b \pmod{m}$ to indicate that $a$ is congruent to $b$ modulo $m$. Note that
$a \equiv b \pmod{m}$ if and only if $a \bmod m = b \bmod m$. Also, if $n$ divides $a$, then $a$ is
congruent to zero modulo $n$.

The great French mathematician Pierre de Fermat (1601-1665)
demonstrated that the congruence

$$a^{p-1} \equiv 1 \pmod{p}$$

holds when $p$ is a prime that does not divide $a$, and this gives us a theorem that will prove crucial in
our analysis of the man-in-the-middle attack.

Fermat's Theorem [4]: If $a \in \mathbb{Z}$ and $p$ is a prime not dividing $a$, then $p$
divides $a^{p-1} - 1$, that is, $a^{p-1} \equiv 1 \pmod{p}$.
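As a quick numerical illustration of Fermat's Theorem, the following sketch checks the congruence for every admissible base modulo a small prime. It is not part of the original development; the function name `fermat_congruence_holds` and the choice of the moduli 37 and 15 are our own illustrative assumptions.

```python
def fermat_congruence_holds(a: int, p: int) -> bool:
    """Return True when a^(p-1) is congruent to 1 modulo p."""
    # Three-argument pow performs modular exponentiation efficiently.
    return pow(a, p - 1, p) == 1


# 37 is prime, so no a in 1..36 is divisible by it and the theorem applies.
holds_for_all = all(fermat_congruence_holds(a, 37) for a in range(1, 37))
print(holds_for_all)

# For a composite modulus the congruence may fail, e.g. modulus 15 with a = 2.
print(fermat_congruence_holds(2, 15))
```

The second call shows only that the congruence is not guaranteed for composite moduli; it says nothing about which composites happen to satisfy it.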


Euler gave a generalization of Fermat's theorem, but we must first define
Euler's Totient Function. Commonly referred to as Euler's Phi Function, the
function gives the number of positive integers less than or equal to $n$ which are relatively
prime to $n$, and is denoted by $\phi(n)$. It is not hard to show that, if $n = \prod_{i=1}^{k} p_i^{e_i}$, then

$$\phi(n) = n \prod_{i=1}^{k} \left(1 - \frac{1}{p_i}\right).$$

Euler's Theorem [4]: If $a \in \mathbb{Z}$ is relatively prime to $n$, then $a^{\phi(n)} - 1$ is
divisible by $n$, that is, $a^{\phi(n)} \equiv 1 \pmod{n}$.
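The product formula and Euler's Theorem can be checked numerically. The sketch below is only an illustration under our own assumptions: the helper names `prime_factors`, `phi`, and `phi_by_counting` are hypothetical, and the factorization uses simple trial division, which is adequate only for small $n$.

```python
from math import gcd


def prime_factors(n: int) -> set:
    """Return the set of distinct prime factors of n (trial division)."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def phi(n: int) -> int:
    """Euler's totient via the product formula n * prod(1 - 1/p)."""
    result = n
    for p in prime_factors(n):
        result -= result // p  # multiply by (1 - 1/p) using integers only
    return result


def phi_by_counting(n: int) -> int:
    """Euler's totient straight from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


print(phi(36), phi_by_counting(36))
# Euler's Theorem for a = 5, n = 36 (5 and 36 are relatively prime).
print(pow(5, phi(36), 36))
```

Comparing `phi` with `phi_by_counting` over a range of inputs is a simple sanity check that the formula agrees with the definition.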


In several cases, this thesis will involve systems of linear congruences.
The Chinese Remainder Theorem (CRT), named after the Chinese heritage of
problems involving systems of linear congruences, states that when the moduli of
a system of linear congruences are pairwise relatively prime, there is a unique
solution of the system modulo the product of the moduli.

CRT [5]: Let $m_1, m_2, \ldots, m_n$ be pairwise relatively prime positive integers and
$a_1, a_2, \ldots, a_n$ arbitrary integers. Then the system

$$\begin{cases}
x \equiv a_1 \pmod{m_1}, \\
x \equiv a_2 \pmod{m_2}, \\
\quad \vdots \\
x \equiv a_n \pmod{m_n}
\end{cases}$$

has a unique solution modulo $m = m_1 m_2 \cdots m_n$. (That is, there is a solution $x$ with
$0 \le x < m$, and all other solutions are congruent modulo $m$ to this solution.)
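One way to compute the solution promised by the CRT is to fold in one congruence at a time. The sketch below illustrates that idea; the function name `crt` and the sample system are our own assumptions, and the code relies on Python 3.8 or later, where `pow(m, -1, k)` returns a modular inverse.

```python
def crt(residues, moduli):
    """Solve x = a_i (mod m_i) for pairwise relatively prime moduli.

    Returns (x, m) with 0 <= x < m, where m is the product of the moduli.
    """
    x, m = 0, 1
    for a_i, m_i in zip(residues, moduli):
        # Find t with x + m*t = a_i (mod m_i); m is invertible modulo m_i
        # because the moduli are pairwise relatively prime.
        t = ((a_i - x) * pow(m, -1, m_i)) % m_i
        x += m * t
        m *= m_i
    return x, m


# Sample system: x = 2 (mod 3), x = 3 (mod 5), x = 2 (mod 7).
solution, modulus = crt([2, 3, 2], [3, 5, 7])
print(solution, modulus)
```

Each pass keeps the earlier congruences intact because the correction `m * t` is a multiple of every modulus already processed.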

B.  GROUP  THEORY 

A group $(G, *)$ is a set $G$, closed under a binary operation $*$, such that
the following axioms are satisfied:

Associativity: For all $a, b, c \in G$, $(a * b) * c = a * (b * c)$.

Identity: There is an element $e$ in $G$ such that for all $x \in G$,
$e * x = x * e = x$.

Inverse: Corresponding to each $a \in G$, there is an element $a'$ in $G$ such
that $a * a' = a' * a = e$.

A group that also satisfies the commutative property is referred to as an abelian
(or commutative) group.

Commutativity: For all $a, b \in G$, $a * b = b * a$.

A group $G$ is said to be a finite group if the set $G$ has a finite number of
elements. In this case, the number of elements is called the order of $G$,
denoted by $|G|$. This thesis is interested only in finite groups.

If a subset $H$ of a group $G$ is closed under the binary operation of $G$ and
if $H$ with the induced operation from $G$ is itself a group, then $H$ is a subgroup of
$G$. We shall let $H \le G$ or $G \ge H$ mean that $H$ is a subgroup of $G$, and $H < G$
or $G > H$ shall mean $H \le G$ but $H \ne G$.

An example of a group is the set of congruence classes of the integers
modulo $n$. Given a positive integer $n$, we denote a congruence class by $[a]_n$,
which is the set of all integers congruent to $a$ modulo $n$. The set of congruence
classes modulo $n$ is denoted by

$$\mathbb{Z}_n = \{[0]_n, [1]_n, [2]_n, \ldots, [n-1]_n\}.$$

This set forms a group under addition, where $[a]_n + [b]_n = [a + b]_n$, and is denoted
by $(\mathbb{Z}_n, +)$. We can easily inspect a group using a group table. Table 1 is a
group table for $\mathbb{Z}_5$ under addition. The elements of $\mathbb{Z}_5$ are the column and row
headings, with the binary operation (addition in this case) in the upper left
corner.


+ | 0  1  2  3  4
--+--------------
0 | 0  1  2  3  4
1 | 1  2  3  4  0
2 | 2  3  4  0  1
3 | 3  4  0  1  2
4 | 4  0  1  2  3

Table 1. Group Table for $(\mathbb{Z}_5, +)$


If $n$ is a prime $p$, then the set $\mathbb{Z}_p^* = \mathbb{Z}_p - \{[0]\}$ forms a group under
multiplication modulo $p$. It is necessary to remove the zero class
because zero has no inverse under multiplication. $(\mathbb{Z}_p^*, \cdot)$, the multiplicative
group of nonzero congruence classes modulo a prime, is the structure we will
be focusing on in this thesis. The Diffie-Hellman key exchange protocol sets this
group as the environment for the key agreement. If we remove the zero element
from the previous example, we have another group table (Table 2), this time with
multiplication as the binary operation.


· | 1  2  3  4
--+-----------
1 | 1  2  3  4
2 | 2  4  1  3
3 | 3  1  4  2
4 | 4  3  2  1

Table 2. Group Table for $(\mathbb{Z}_5^*, \cdot)$
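Group tables such as Tables 1 and 2 are easy to regenerate mechanically, which is a convenient way to inspect other moduli. The sketch below is an illustration only; the helper name `cayley_table` is our own assumption.

```python
def cayley_table(elements, op):
    """Return the group table as a list of rows, one row per left operand."""
    elements = list(elements)
    return [[op(a, b) for b in elements] for a in elements]


p = 5
# Additive group Z_5: all residues 0..4 under addition modulo 5.
add_table = cayley_table(range(p), lambda a, b: (a + b) % p)
# Multiplicative group Z_5^*: nonzero residues under multiplication modulo 5.
mul_table = cayley_table(range(1, p), lambda a, b: (a * b) % p)

for row in mul_table:
    print(row)
```

Every row and column of a group table is a permutation of the group elements, which gives a quick visual check that inverses exist.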

Let $G$ be a group and let $a \in G$. Then the subgroup $\{a^n \mid n \in \mathbb{Z}\}$ of $G$ is
called the cyclic subgroup of $G$ generated by $a$, and is denoted by $\langle a \rangle$. Further,
$a$ generates $G$ if $\langle a \rangle = G$. A group $G$ is cyclic if there is some element $a$ in $G$
that generates $G$.

The group $(\mathbb{Z}_p^*, \cdot)$ is always cyclic. An important property of cyclic groups
is that every subgroup of a cyclic group is also cyclic. Another important property
of groups in general is the Theorem of Lagrange.

Lagrange's Theorem [4]: Let $H$ be a subgroup of a finite group $G$.
Then the order of $H$ is a divisor of the order of $G$.

This  powerful  theorem  makes  the  attack  we  will  analyze  later  possible. 
We know the order of $(\mathbb{Z}_p^*, \cdot)$ is $p - 1$. The two properties mentioned above tell
us that any subgroup of $(\mathbb{Z}_p^*, \cdot)$ will also be cyclic and the order of the subgroup
will be a divisor of $p - 1$.
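The divisibility statement can be observed directly for a small prime. The following sketch, with the hypothetical helper `generated_subgroup` and the illustrative choice $p = 37$, lists the cyclic subgroup generated by an element and confirms that its size divides $p - 1$.

```python
def generated_subgroup(a: int, p: int) -> list:
    """Return the cyclic subgroup of Z_p^* generated by a, as a sorted list.

    Assumes p is prime and a is not divisible by p.
    """
    elements, value = [], 1
    while True:
        value = (value * a) % p
        elements.append(value)
        if value == 1:  # the powers of a have cycled back to the identity
            return sorted(elements)


p = 37
subgroup_sizes = {a: len(generated_subgroup(a, p)) for a in range(1, p)}

print(generated_subgroup(10, p))            # a small subgroup
print(sorted(set(subgroup_sizes.values())))  # every size divides p - 1 = 36
```

The sizes that appear are exactly the divisors of 36, as expected for a cyclic group of order 36.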

C. FIELD THEORY

A field $(F, +, \cdot)$ is a set $F$ together with two binary operations, which we
will call addition and multiplication, defined on $F$ such that the following axioms
are satisfied:

Addition: $(F, +)$ is an abelian group.

Multiplication: $(F^*, \cdot)$ is an abelian group, where $F^*$ is $F$ with the additive identity removed.

Distributive: For all $a, b, c \in F$, $a \cdot (b + c) = (a \cdot b) + (a \cdot c)$.

A field $F$ is said to be a finite field if the set $F$ has a finite number of elements.
If $F$ is a finite field, then the multiplicative group $(F^*, \cdot)$ is cyclic.

For every prime $p$ and positive integer $n$, there is exactly one finite field
(up to isomorphism) of order $p^n$. This field $GF(p^n)$ is usually referred to as the
Galois field of order $p^n$. Oftentimes, the Diffie-Hellman key exchange protocol
is described using the environment $GF(p)$ instead of the group $\mathbb{Z}_p^*$. In the group
theory section, we described the notion of a generator of a cyclic group. In field
theory, specifically in $GF(p)$, an element that generates the entire
multiplicative group is known as a primitive root. The number of primitive roots of
a field $GF(p)$ is $\phi(\phi(p)) = \phi(p - 1)$.
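The count $\phi(p - 1)$ can be confirmed by brute force for a small prime. The sketch below is illustrative only; the helper names `phi` and `is_primitive_root` are our own, and the exhaustive search is feasible only because $p = 37$ is tiny.

```python
from math import gcd


def phi(n: int) -> int:
    """Euler's totient, computed directly from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def is_primitive_root(g: int, p: int) -> bool:
    """Return True when g generates the whole multiplicative group of GF(p)."""
    value = 1
    for exponent in range(1, p):
        value = (value * g) % p
        if value == 1:
            # g is a primitive root exactly when its powers first
            # return to 1 at exponent p - 1.
            return exponent == p - 1
    return False


p = 37
primitive_roots = [g for g in range(1, p) if is_primitive_root(g, p)]
print(primitive_roots)
print(len(primitive_roots), phi(p - 1))
```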

D.  COMPUTATIONAL  COMPLEXITY 

Before  the  discussion  of  primality  testing,  it  is  important  to  understand 
what  makes  one  test  more  efficient  than  another.  Computational  complexity 
involves  the  study  of  the  efficiency  of  algorithms  based  on  the  time  and  memory 
space required to solve a problem of a particular size [5]. Usually, complexities
are expressed using the Big-O notation.

Definition [5]: Let $f$ and $g$ be functions from the set of integers or the
set of real numbers to the set of real numbers. We say that $f(x)$ is $O(g(x))$ if
there are constants $C$ and $k$ such that

$$|f(x)| \le C|g(x)|$$

whenever $x > k$. [This is read as "$f(x)$ is big-oh of $g(x)$."]

This notation is extremely helpful when comparing algorithms, such as the
primality tests we will discuss. We will use the Big-O notation as an upper bound
on the number of operations a test will require. In general, the smaller the upper
bound, the more efficient the test is. The more efficient the test is, the quicker it
can complete the required steps of an algorithm and give an answer. Thus,
using the Big-O notation, we can often quickly decide which test will finish
soonest, using fewer resources and less computer time.

The most commonly used functions in Big-O notation are:

$$1,\ \log n,\ n,\ n \log n,\ n^2,\ 2^n,\ n!$$

It can be shown that each function in the list is smaller than the succeeding
function as $n$ grows without bound [5]. Figure 1 demonstrates this fact.

Figure 1. Growth of Functions Used in Big-O Estimates [From 5]

Notice the vertical axis scale is logarithmic, doubling each unit. This
causes the exponential function $2^n$ to appear as a straight line.

An algorithm that is Big-O of a constant has constant complexity. An
algorithm that is Big-O of a logarithm has logarithmic complexity, and so on.
Table 3 displays the common terminology used to describe the time complexity
of algorithms.

Complexity    | Terminology
--------------+------------------------
$O(1)$        | Constant complexity
$O(\log n)$   | Logarithmic complexity
$O(n)$        | Linear complexity
$O(n^b)$      | Polynomial complexity
$O(b^n)$      | Exponential complexity
$O(n!)$       | Factorial complexity

Table 3. Computational Complexity Terminology [From 5]

The  algorithms  we  will  be  concerned  with  are  of  polynomial  and 
exponential  complexity.  The  difference  between  the  two  can  be  enormous. 
Polynomial  or  better  complexities  are  called  tractable,  because  it  is  assumed 
that  given  a  reasonably-sized  input,  the  algorithm  will  produce  an  answer  in  a 
reasonable  amount  of  time.  On  the  other  hand,  exponential  complexities  or 
worse  are  called  intractable.  This  is  because  an  extremely  large  amount  of  time 
is  usually  required  to  run  the  algorithm.  However,  a  polynomial  complexity 
algorithm  with  a  very  high  degree  might  take  longer  to  run  than  an  exponential 
complexity  algorithm  with  a  small  base. 

E.  PRIMALITY  TESTING 

We  now  turn  to  a  topic  of  critical  importance  in  our  analysis  of  the  man-in- 
the-middle  attack.  Suppose  a  large  integer  is  given.  How  might  we  quickly  be 
able  to  tell  if  the  number  is  prime  or  composite?  Mathematicians  have  studied 
this  question  for  millennia,  and  recently  this  question  has  become  even  more 
important  as  modern  computing  power  has  granted  the  ability  to  test  theories  on 
a  scale  that  was  at  one  point  inconceivable.  A  primality  test  is  an  algorithm  for 
determining  whether  an  input  number  is  prime.  Primality  tests  can  be  divided 
into  two  main  groups:  deterministic  and  probabilistic.  Deterministic  primality 
tests  prove  with  certainty  whether  a  number  is  prime  or  composite.  Probabilistic 
primality tests tell us a number is composite or probably prime. If a probabilistic
method reports that the number is composite, the number is definitely composite.
However, if it reports the number as prime, there is a controllably small chance
the number is actually composite [6].

Primality  testing  is  currently  a  topic  of  great  interest  and  research  and  is, 
therefore,  very  dynamic.  We  provide  descriptions  of  several  deterministic  and 
probabilistic  algorithms  as  background  for  the  reader.  It  is  by  no  means  a 
comprehensive  discussion  of  every  algorithm  available.  Rather,  we  use  this 
section  as  a  way  to  motivate  our  choice  of  a  primality  test  for  later  on  when  we 
will  need  to  quickly  determine  if  a  given  number  is  prime. 


1.  Deterministic  Primality  Tests 
a.  Trial  Division 


The simplest primality test is trial division. Trial division is the
method of sequentially trying test divisors into a number $n$ so as to partially or
completely factor $n$ [6]. We start with the first prime number, 2, and try to divide
$n$ by 2. If 2 divides $n$, we know $n$ is composite and can stop. If 2 does not
divide $n$, we try the next prime number, 3. If 3 divides $n$, we stop. If not, we try
the next prime, and so on. When we reach a trial divisor that is greater than the
square root of $n$, we may stop. If no prime up to the square root of $n$ divides $n$,
then we declare $n$ a prime.


This test is quite computationally intensive. Let $\pi(t)$ be the prime
counting function, which counts the number of primes $\le t$. Trial division
requires (in the worst case) about $\pi(\sqrt{n}) \approx \frac{2\sqrt{n}}{\ln n}$ divisions if the primes up to $\sqrt{n}$
are stored in a database, or on the order of $\sqrt{n}$ divisions if the primes are not stored before
the test starts.
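A minimal sketch of trial division follows. It makes one simplifying assumption that differs from the description above: rather than keeping a stored list of primes, it tries 2 and then every odd number up to $\sqrt{n}$, which corresponds to the case where the primes are not stored in advance. The function name `is_prime_trial_division` is our own.

```python
from math import isqrt


def is_prime_trial_division(n: int) -> bool:
    """Decide primality of n by trial division up to the square root of n."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    # Try odd divisors only; any composite divisor that divides n
    # would already have been caught by one of its prime factors.
    for d in range(3, isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True


print([n for n in range(2, 40) if is_prime_trial_division(n)])
print(is_prime_trial_division(341), is_prime_trial_division(703))
```

The loop runs roughly $\sqrt{n}/2$ times in the worst case, which is why the method is practical only for small $n$.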

b. The n-1 Test

Trial division can be used to test small numbers for primality, but for
larger numbers there are better methods [6]. The $n-1$ test is based on Fermat's
little theorem, and suggests that we try to factor $n-1$, not $n$. In 1876, E. Lucas
turned Fermat's little theorem into a primality test.

Lucas' Theorem [6]: If $a, n$ are integers with $n > 1$, and $a^{n-1} \equiv 1 \pmod{n}$,
but $a^{(n-1)/q} \not\equiv 1 \pmod{n}$ for every prime $q \mid n-1$, then $n$ is
prime.
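The conditions of Lucas' Theorem translate directly into code. The sketch below is an illustration under our own assumptions: the names `prime_factors` and `lucas_certifies` are hypothetical, and $n - 1$ is factored by simple trial division, which is realistic only for small $n$.

```python
def prime_factors(n: int) -> set:
    """Return the set of distinct prime factors of n (trial division)."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def lucas_certifies(a: int, n: int) -> bool:
    """Return True when base a proves n prime via Lucas' Theorem."""
    if n <= 1 or pow(a, n - 1, n) != 1:
        return False
    # a^((n-1)/q) must differ from 1 modulo n for every prime q dividing n - 1.
    return all(pow(a, (n - 1) // q, n) != 1 for q in prime_factors(n - 1))


print(lucas_certifies(2, 37))   # base 2 certifies 37
print(lucas_certifies(3, 37))   # base 3 does not, although 37 is prime
print(lucas_certifies(2, 341))  # a composite is never certified
```

A result of `False` for a particular base is inconclusive on its own: the base 3 example shows that a prime can fail to be certified when the chosen base is not a generator.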

The most difficult step in implementing the Lucas test is finding the
complete factorization of $n-1$. Pocklington strengthened the result by realizing a
partial factorization would suffice [6]. In particular, say

$n - 1 = FR$, and the complete factorization of $F$ is known. (1)

Pocklington's Theorem: Suppose (1) holds and $a^{n-1} \equiv 1 \pmod{n}$ and
$\gcd(a^{(n-1)/q} - 1, n) = 1$ for each prime $q \mid F$. Then every prime factor of $n$ is
congruent to $1 \pmod{F}$. (2)

Corollary ($n-1$ test): If (1) and (2) hold and $F > \sqrt{n}$, then $n$ is prime.

Several results have allowed a smaller value of $F$. These include
work done by Brillhart, Lehmer, Selfridge, Konyagin, and Pomerance [6].

The Lucas test and variations of it have a running time of about
$O((\log n)^3)$. The question of finding the "right" base still remains.

c.  Elliptic  Curve  Primality  Proving 

Elliptic  Curve  Primality  Proving  (ECPP)  is  a  class  of  algorithms  that 
provide  certificates  of  primality  using  sophisticated  results  from  the  theory  of 
elliptic  curves.  A  detailed  description  of  the  background,  theory,  and 
implementation of the ECPP can be found in Atkin and Morain [7].

ECPP is the fastest known general-purpose primality-testing
algorithm. ECPP has a running time of $O((\log n)^4)$ [7].

d.  The  AKS  Test 

In August 2002, the Agrawal-Kayal-Saxena (AKS) primality test
was published in a paper titled "PRIMES is in P" [8]. The result was highly
celebrated because of the four properties the test satisfies:

1) It can be used to verify the primality of any given number.

2) The maximum running time is polynomial.

3) The algorithm is deterministic, not probabilistic.

4) The algorithm is not conditional on an unproven hypothesis.

There are other algorithms that satisfy three of the four properties,
but AKS is the only known test to satisfy all four.

The test is based upon the equivalence

$$(x - a)^n \equiv (x^n - a) \pmod{n}$$

for $a$ coprime to $n$, which is true if and only if $n$ is prime. This is a generalization
of Fermat's Little Theorem and constitutes a primality test by itself. However, the
verification of primality would take exponential time, and thus requires
improvement. The AKS test makes use of a related equivalence

$$(x - a)^n \equiv (x^n - a) \pmod{n,\ x^r - 1}.$$

This equivalence can be checked in polynomial time, with the complexity of the
original algorithm being $O((\log n)^{12})$. However, recently the complexity has been
brought down to $O((\log n)^6)$ [9].

2. Probabilistic Primality Tests

a.  Fermat  Primality  Test 

Based  on  Fermat’s  Little  Theorem,  the  Fermat  Primality  Test  is  a 
probabilistic  primality  test  that  is  the  basis  for  the  Miller-Rabin  primality  test  used 
later  on  in  the  thesis. 

Recall that by Fermat's Little Theorem, if $p$ is prime and $p$ does
not divide $a$, then $a^{p-1} \equiv 1 \pmod{p}$. If we want to test if a given integer $n$ is
prime, we compute $a^{n-1} \bmod n$ for several values of $a$. If the result is not 1 for
some value of $a$, then $n$ is composite. If the result is 1 for many values of $a$,
then we can say that $n$ is probably prime.

The reason we can only say probably is that the congruence
$a^{n-1} \equiv 1 \pmod{n}$ may hold when $n$ is composite. A composite number $n$ is a
(Fermat) pseudoprime to base $a$ if the congruence $a^{n-1} \equiv 1 \pmod{n}$ holds [6].
Unfortunately, for the Fermat Primality Test, there are infinitely many composite numbers
that the test would call probably prime even if every base $a$ coprime to $n$ was tried [6].
These numbers are the so-called Carmichael numbers and give us reason to
look for a test that a composite number can pass for only a fixed fraction of the bases
attempted. The Miller-Rabin test accomplishes this goal.
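The behavior described above can be reproduced with a few lines of code. The sketch below is illustrative; the function name `fermat_probably_prime` is our own, and the numbers 341 and 561 are chosen by us as a small base-2 pseudoprime and the smallest Carmichael number, respectively.

```python
from math import gcd


def fermat_probably_prime(n: int, bases) -> bool:
    """Return True when a^(n-1) = 1 (mod n) for every base supplied."""
    return all(pow(a, n - 1, n) == 1 for a in bases)


# 341 = 11 * 31 passes for base 2 but is exposed by base 3.
print(fermat_probably_prime(341, [2]))
print(fermat_probably_prime(341, [2, 3]))

# 561 = 3 * 11 * 17 passes for every base coprime to it.
coprime_bases = [a for a in range(2, 561) if gcd(a, 561) == 1]
print(fermat_probably_prime(561, coprime_bases))
```

Only a base sharing a factor with 561 would reveal it as composite, and such bases become rare as the prime factors grow.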

b.  Miller-Rabin  Primality  Test 

The Miller-Rabin Primality Test is an efficient probabilistic algorithm
to test for primality based on the idea of strong pseudoprimes. Consider an odd
composite number $n$ and write $n - 1 = d \cdot 2^s$ with $d$ odd. Then $n$ is a strong pseudoprime to base $a$ if
either $a^d \equiv 1 \pmod{n}$ or $a^{d \cdot 2^r} \equiv -1 \pmod{n}$ for some $r = 0, 1, \ldots, s-1$. The Carmichael
numbers are Fermat pseudoprimes for every base coprime to $n$. However, a composite
number can only be a strong pseudoprime to at most one quarter of all bases [6].

The algorithm is as follows:

Choose a random integer $a \in [2, n-2]$. If $a^d \not\equiv 1 \pmod{n}$ and
$a^{d \cdot 2^r} \not\equiv -1 \pmod{n}$ for all $r = 0, 1, \ldots, s-1$, then $a$ is called a witness and $n$ is
composite. Otherwise, $n$ is a strong probable prime to base $a$.

If $n > 9$ is an odd composite, the probability that the algorithm will
fail to produce a witness for $n$ is $< 1/4$. The probability that we fail to find a
witness after $k$ iterations is $< 1/4^k$ [6]. We can make this probability as small as
we desire with a large number of iterations. For instance, if we wanted to ensure
the probability of calling a composite number a prime is less than $10^{-6}$, we must
compute 10 iterations or more, since $4^{-10} \approx 9.5 \times 10^{-7}$.

As an example, suppose we wanted to determine if the number 341
is prime. First we write $341 - 1 = 340 = 2^2 \cdot 85$. So $s = 2$ and $d = 85$. We randomly
select $a = 38$ and proceed with:

$a^d \bmod n = 38^{85} \bmod 341 = 56 \ne 1$

$a^{2^0 d} \bmod n = 38^{85} \bmod 341 = 56 \ne n - 1$

$a^{2^1 d} \bmod n = 38^{170} \bmod 341 = 67 \ne n - 1$

Since none of the congruences hold, we know 341 is composite. In
fact, $341 = 11 \cdot 31$. However, consider $n = 703$ and $a = 3$. $703 - 1 = 702 = 2^1 \cdot 351$.
So $s = 1$ and $d = 351$. Continuing:

$a^d \bmod n = 3^{351} \bmod 703 = 702 \ne 1$

$a^{2^0 d} \bmod n = 3^{351} \bmod 703 = 702 = n - 1$

By the second congruence, 703 is a strong pseudoprime to base 3. If
we then try $a = 5$, we get:

$a^d \bmod n = 5^{351} \bmod 703 = 438 \ne 1$

$a^{2^0 d} \bmod n = 5^{351} \bmod 703 = 438 \ne n - 1$

This time neither congruence holds, and we know 703 is a
composite number. In fact, $703 = 19 \cdot 37$.
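The algorithm and the two worked examples can be reproduced with the sketch below. It is a minimal illustration rather than a hardened implementation: the names `decompose`, `is_witness`, and `miller_rabin` are our own, the default of 10 rounds is an assumption taken from the error estimate above, and Python's `random` module is used for base selection purely for convenience.

```python
import random


def decompose(n: int):
    """Write n - 1 = d * 2^s with d odd and return (d, s)."""
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    return d, s


def is_witness(a: int, n: int) -> bool:
    """Return True when base a proves the odd number n composite."""
    d, s = decompose(n)
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return False  # a^d = 1 or a^(d * 2^0) = -1 (mod n)
    for _ in range(s - 1):
        x = (x * x) % n  # a^(d * 2^r) for r = 1, ..., s - 1
        if x == n - 1:
            return False
    return True


def miller_rabin(n: int, rounds: int = 10) -> bool:
    """Return False if n is certainly composite, True if probably prime."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    return not any(is_witness(random.randint(2, n - 2), n) for _ in range(rounds))


print(decompose(341), pow(38, 85, 341), is_witness(38, 341))
print(decompose(703), is_witness(3, 703), is_witness(5, 703))
```

Because a prime has no witnesses, `miller_rabin` never rejects a prime; the only possible error is accepting a composite, with probability bounded as described above.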


The Miller-Rabin test is very fast and has a complexity of $O((\log n)^3)$.


Of  course,  because  it  is  probabilistic,  there  is  a  chance  of  the  test 
returning  a  number  as  prime  when  it  is  in  fact  composite.  However,  as  will  be 
demonstrated  later,  we  are  very  concerned  with  the  speed  of  the  primality  test 
and  no  deterministic  test  will  run  fast  enough  for  our  purpose.  The  Miller-Rabin 
test  offers  us  both  speed,  as  compared  to  other  primality  tests,  and  the  ability  to 
control the probability of error and will be our tool of choice.

III. DIFFIE-HELLMAN AND THE DISCRETE LOGARITHM


A.  THE  DIFFIE-HELLMAN  PROTOCOL 

“We  stand  today  on  the  brink  of  a  revolution  in  cryptography.”  This  was 
the first sentence in a breakthrough paper published in 1976 by Whitfield Diffie
and Martin E. Hellman. In the paper, titled New Directions in Cryptography [10],
the authors introduced the idea of public key cryptography and a key exchange
protocol that was named in their honor. The Diffie-Hellman protocol provided the
first practical solution to the key distribution problem, allowing two parties, never
having met in advance or shared keying material, to establish a shared secret by
exchanging messages over an open channel. The key can then be used to
encrypt subsequent communications using a symmetric key cipher. The security
rests on the intractability of the Diffie-Hellman problem and the related problem of
computing discrete logarithms [1]. We will call the two parties conducting the key
exchange "Alice" and "Bob."

Protocol steps:

1. A prime number $p$ and generator $a$ of $\mathbb{Z}_p^*$ ($2 \le a \le p-2$) are
selected and published.

2. Alice chooses a random secret $x$, $1 \le x \le p-2$, and sends Bob
$a^x \bmod p$.

A → B: $a^x \bmod p$

3. Bob chooses a random secret $y$, $1 \le y \le p-2$, and sends Alice
$a^y \bmod p$.

B → A: $a^y \bmod p$

4. Bob receives $a^x$ and computes the shared key as $K = (a^x)^y \bmod p$.

5. Alice receives $a^y$ and computes the shared key as $K = (a^y)^x \bmod p$.

Because $(a^y)^x = (a^x)^y$, Alice and Bob have arrived at the same secret
key. Only $x$, $y$, and $a^{xy}$ are kept secret. All other values are sent in the
clear. The example below illustrates the procedure.

1. Alice and Bob agree on $p = 37$ and $a = 2$.

2. Alice chooses $x = 14$ and sends Bob 30 ($= 2^{14} \bmod 37$).

A -> B: 30

3. Bob chooses $y = 23$ and sends Alice 5 ($= 2^{23} \bmod 37$).

B -> A: 5

4. Bob receives 30 and computes $30^{23} \bmod 37 = 28$.

5. Alice receives 5 and computes $5^{14} \bmod 37 = 28$.

Alice and Bob have agreed upon 28 as their secret key.
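
The exchange above can be replayed in a few lines of Python. This is only a minimal sketch: the function names (`dh_public`, `dh_shared`, `random_secret`) are illustrative choices, not part of any standard library, and the toy parameters are those of the worked example.

```python
import secrets

def dh_public(a, p, secret):
    """Value sent over the open channel: a^secret mod p."""
    return pow(a, secret, p)

def dh_shared(received, p, secret):
    """Shared key computed from the other party's public value."""
    return pow(received, secret, p)

def random_secret(p):
    """Random exponent in the range 1..p-2, as in steps 2 and 3."""
    return secrets.randbelow(p - 2) + 1

# Replay the worked example: p = 37, a = 2, x = 14, y = 23.
p, a = 37, 2
x, y = 14, 23
to_bob = dh_public(a, p, x)            # what Alice sends
to_alice = dh_public(a, p, y)          # what Bob sends
key_bob = dh_shared(to_bob, p, y)      # (a^x)^y mod p
key_alice = dh_shared(to_alice, p, x)  # (a^y)^x mod p
print(to_bob, to_alice, key_alice, key_bob)  # 30 5 28 28
```

In practice the exponents would come from `random_secret` and the prime would be far larger; the fixed values here only make the output comparable with the example.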

Figure  2  demonstrates  which  parties  know  what  information.  The  man-in- 
the-middle  will  be  called  Eve  from  here  on  out. 

Obviously, a much larger value of $p$ is required than used in the example
to make the key agreement potentially secure. If the prime number 37 were
used, Eve could simply try all possible values of $2^x \bmod 37$. Because 2 is a
primitive root modulo 37, this can take 36 values. A key space with only 36
possibilities can be exhausted with ease. However, if the prime number used is
large enough, no computing power available today can exhaust the key space.
For instance, most applications recommend 1024-bit primes [2]. This correlates
to a number of about 300 digits and makes searching the key space one by one
infeasible. Table 4 demonstrates how long it would take a modern personal
computer (PC) and a super-computer (SC) to exhaust various sizes of key
spaces. We assume a PC can search approximately one million ($10^6$) keys per
second, while a super-computer can search approximately one trillion ($10^{12}$) keys
per second.


For instance, if a prime of 64 bits were used, it would correlate to a base-ten
number of approximately 19 digits. The key space would then be on the order of
$10^{19}$ numbers. Therefore, a PC would take

$\frac{10^{19}}{10^{6}} = 10^{13}$ seconds

to completely search the entire key space.


Bits    Digits           PC time             SC time
        (approximate)    (approximate)       (approximate)
64      19               317,098 years       115 days
128     39               3 x 10^25 years     3 x 10^19 years
256     77               3 x 10^63 years     3 x 10^57 years
512     154              3 x 10^140 years    3 x 10^134 years
1024    308              3 x 10^294 years    3 x 10^288 years
2048    616              3 x 10^602 years    3 x 10^596 years

Table 4. Times to Exhaust a Key Space
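
The entries of Table 4 can be recomputed with a short script. The sketch below assumes, as the table appears to, that a key space of $d$ digits holds about $10^d$ keys and that a year has 365 days; it works with base-ten logarithms because the larger values overflow ordinary floating point. The function name `years_to_exhaust` is an illustrative choice.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000; leap years ignored

def years_to_exhaust(digits, keys_per_second):
    """Return (mantissa, exponent) such that years = mantissa * 10**exponent."""
    log10_years = (digits
                   - math.log10(keys_per_second)
                   - math.log10(SECONDS_PER_YEAR))
    exponent = math.floor(log10_years)
    mantissa = 10 ** (log10_years - exponent)
    return mantissa, exponent

# Digit counts are taken directly from Table 4.
for bits, digits in [(64, 19), (128, 39), (256, 77),
                     (512, 154), (1024, 308), (2048, 616)]:
    pc_m, pc_e = years_to_exhaust(digits, 10**6)    # personal computer
    sc_m, sc_e = years_to_exhaust(digits, 10**12)   # super-computer
    print(f"{bits:5d} bits: PC {pc_m:.2f}e{pc_e} years, SC {sc_m:.2f}e{sc_e} years")
```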

Considering most applications use primes of 1024 bits or greater, it is
obviously infeasible to conduct a random search of an entire key space. Of
course, one could get lucky and the key could be one of the first numbers
searched by the computer. However, as indicated by the enormous times listed
in the table, it is more likely a random key search would take longer than most
scientists believe the universe has existed.

B.  THE  DISCRETE  LOGARITHM 

Eve has more information than just the fact that the key resides somewhere
in the key space. Because the exchange occurs over an open channel, Eve
knows $a^x$ and $a^y$ as well. If $\beta \equiv a^x \pmod{p}$ and $\gamma \equiv a^y \pmod{p}$, then $p$, $a$, $\beta$,
and $\gamma$ are known. All Eve has to do is solve $a^x \equiv \beta \pmod{p}$ for $x$ or
$a^y \equiv \gamma \pmod{p}$ for $y$. Once $x$ or $y$ is known, Eve simply raises $a^y$ to $x$ or $a^x$
to $y$ and arrives at the secret key $K$. However, if $p$ is large, solving
$a^x \equiv \beta \pmod{p}$ for $x$ in general is considered difficult. The problem of finding $x$
in this case is known as the discrete logarithm problem (DLP), often
abbreviated $x = L_a(\beta)$.
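
For a toy prime the discrete logarithm can be found by simply trying every exponent, which is what makes the earlier example with $p = 37$ insecure. The sketch below is illustrative only (the function name is hypothetical) and recovers Alice's secret from the value 30 she sent, then derives the key from Bob's value 5.

```python
def discrete_log_exhaustive(a, beta, p):
    """Smallest x in 0..p-2 with a^x = beta (mod p), or None if there is none."""
    value = 1                      # a^0
    for x in range(p - 1):
        if value == beta % p:
            return x
        value = value * a % p      # step to a^(x+1)
    return None

p, a = 37, 2
x = discrete_log_exhaustive(a, 30, p)   # Eve recovers Alice's exponent
key = pow(5, x, p)                      # Bob's public value raised to x
print(x, key)  # 14 28
```

The loop performs up to $p - 1$ multiplications, so the same approach is hopeless for the prime sizes in Table 4.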

The difficulty of solving the DLP yields useful cryptosystems. The
Diffie-Hellman key exchange protocol, the ElGamal encryption system, and the Digital
Signature Algorithm all rely on the difficulty of solving the DLP. However, not all
public-key cryptosystems rely on the difficulty of the DLP. Another number
theory problem that yields cryptosystems is the problem of factoring large
integers. RSA, considered by many to be the most popular public-key
cryptography algorithm, relies on the difficulty of factorization for its security. The
size of the largest primes for which discrete logs can be computed has usually
been approximately the same as the size of the largest integers that could be
factored [11]. In 2005, a discrete logarithm modulo a 168-digit prime (556 bits) was
computed, setting a record at that time. The record factorization up to then was
200 digits (663 bits).

As discussed above, if $p$ is small, it is easy to compute discrete logs by
exhaustive search. However, when $p$ is large, this is not feasible. We will now
discuss several methods of attacking the DLP.

1.  The  Pohlig-Hellman  Algorithm 

Pohlig and Hellman introduced the following algorithm in 1978 to solve
discrete logs when $p-1$ has only small prime factors [11], [12].

Suppose

$p - 1 = \prod_i q_i^{r_i}$

is the factorization of $p-1$ into primes. Let $q^r$ be one of the factors. The idea is
to compute $x \pmod{q_i^{r_i}}$ for each $q_i^{r_i}$ and combine them using the Chinese
Remainder Theorem to find the discrete logarithm.

Thus, $x \pmod{q^r}$ is found by writing $x = x_0 + x_1 q + x_2 q^2 + \cdots$ with $0 \le x_i \le q-1$ and
determining the coefficients $x_0, x_1, \ldots, x_{r-1}$.

General idea: Starting with $\beta \equiv a^x$, raise both sides to the $(p-1)/q$ power to obtain

$\beta^{(p-1)/q} \equiv a^{x(p-1)/q} \equiv (a^{p-1})^{x_1 + x_2 q + \cdots} \, a^{x_0(p-1)/q} \equiv a^{x_0(p-1)/q} \pmod{p}$

To find $x_0$, simply look at the powers

$a^{k(p-1)/q} \pmod{p}$, $k = 0, 1, 2, \ldots, q-1$,

until one of them yields $\beta^{(p-1)/q}$. Then $x_0 = k$.

An extension of this idea yields the remaining coefficients. Assume $q^2 \mid p-1$. Let

$\beta_1 \equiv \beta a^{-x_0} \equiv a^{q(x_1 + x_2 q + \cdots)} \pmod{p}$

Raise both sides to the $(p-1)/q^2$ power to obtain

$\beta_1^{(p-1)/q^2} \equiv a^{(p-1)(x_1 + x_2 q + \cdots)/q} \equiv (a^{p-1})^{x_2 + x_3 q + \cdots} \, a^{x_1(p-1)/q} \equiv a^{x_1(p-1)/q} \pmod{p}$.

To find $x_1$, simply look at the powers

$a^{k(p-1)/q} \pmod{p}$, $k = 0, 1, 2, \ldots, q-1$,

until one of them yields $\beta_1^{(p-1)/q^2}$. Then $x_1 = k$.

If $q^3 \mid p-1$, let $\beta_2 \equiv \beta_1 a^{-x_1 q}$, raise both sides to the $(p-1)/q^3$ power, and find $x_2$.

We can continue this process until we find that $q^{r+1}$ does not divide $p-1$. We
have then determined $x_0, x_1, \ldots, x_{r-1}$, so we know $x \pmod{q^r}$.

Repeat this procedure for all prime factors of $p-1$. This yields $x \pmod{q_i^{r_i}}$
for each $q_i$, and we combine these using the Chinese Remainder Theorem to
find $x \pmod{p-1}$. Since $0 \le x < p-1$, this determines $x$.

As an example, let us solve $2^x \equiv 3 \pmod{101}$ for $x$.

$p - 1 = 100 = 2^2 \cdot 5^2$, so $q = 2, 5$.

First, we find $x \pmod{2^2}$. Let $x \equiv x_0 + 2x_1 \pmod{2^2}$. Then

$\beta^{(p-1)/2} \equiv 3^{50} \equiv -1 \pmod{101}$ and $a^{(p-1)/2} \equiv 2^{50} \equiv -1 \pmod{101}$.

So $-1 \equiv (-1)^{x_0}$ and $x_0 = 1$.

Continuing, $\beta_1 \equiv \beta a^{-x_0} \equiv 3 \cdot 2^{-1} \equiv 3 \cdot 51 \equiv 52 \pmod{101}$. So $\beta_1^{(p-1)/2^2} \equiv 52^{25} \equiv 1 \pmod{101}$
and $1 \equiv (-1)^{x_1}$. So $x_1 = 0$ and $x \equiv 1 + 2 \cdot 0 \equiv 1 \pmod{2^2}$.

Next, we find $x \pmod{5^2}$. Let $x \equiv x_0 + 5x_1 \pmod{5^2}$. Then

$\beta^{(p-1)/5} \equiv 3^{20} \equiv 84 \pmod{101}$ and $a^{(p-1)/5} \equiv 2^{20} \equiv 95 \pmod{101}$.

We make a list,

$95^0 \equiv 1$; $95^1 \equiv 95$; $95^2 \equiv 36$; $95^3 \equiv 87$; $95^4 \equiv 84 \pmod{101}$.

Matching with the list, we see that $x_0 = 4$.

Continuing, we get $\beta_1 \equiv \beta a^{-x_0} \equiv 3 \cdot 2^{-4} \equiv 3 \cdot 19 \equiv 57$. So $\beta_1^{(p-1)/5^2} \equiv 57^4 \equiv 87 \pmod{101}$.
We again compare with the above list and see that $95^3 \equiv 87$, so $x_1 = 3$. This leads
to $x \equiv 4 + 5 \cdot 3 \equiv 19 \pmod{5^2}$.

Now, we combine $x \equiv 1 \pmod{2^2}$ and $x \equiv 19 \pmod{5^2}$ using the Chinese
Remainder Theorem to find $x = 69$. So $2^{69} \equiv 3 \pmod{101}$.
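
The procedure can be sketched in Python as follows. This is a minimal illustration, not an optimized implementation: it factors $p-1$ by trial division, finds each digit $x_i$ by table lookup over the $q$ powers of $a^{(p-1)/q}$, and assumes that $a$ is a primitive root. All function names are illustrative, and the negative exponents passed to `pow` require Python 3.8 or later.

```python
def factorize(n):
    """Trial-division factorization: returns {prime: exponent}."""
    factors, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def dlog_mod_prime_power(a, beta, p, q, r):
    """Return x mod q^r, where a^x = beta (mod p) and q^r divides p-1."""
    base = pow(a, (p - 1) // q, p)                    # a^((p-1)/q)
    table = {pow(base, k, p): k for k in range(q)}    # powers k = 0..q-1
    x, beta_i = 0, beta % p
    for i in range(r):
        target = pow(beta_i, (p - 1) // q ** (i + 1), p)
        x_i = table[target]                           # digit x_i of x in base q
        x += x_i * q ** i
        # Remove the digit just found: beta_{i+1} = beta_i * a^(-x_i q^i).
        beta_i = beta_i * pow(a, -x_i * q ** i, p) % p
    return x

def crt(residues, moduli):
    """Combine x = r_i (mod m_i) for pairwise coprime moduli."""
    x, m = 0, 1
    for r_i, m_i in zip(residues, moduli):
        t = (r_i - x) * pow(m, -1, m_i) % m_i
        x += m * t
        m *= m_i
    return x % m

def pohlig_hellman(a, beta, p):
    residues, moduli = [], []
    for q, r in factorize(p - 1).items():
        residues.append(dlog_mod_prime_power(a, beta, p, q, r))
        moduli.append(q ** r)
    return crt(residues, moduli)

print(pohlig_hellman(2, 3, 101))  # 69
```

Running the two prime-power steps separately reproduces the intermediate results of the example: 1 modulo $2^2$ and 19 modulo $5^2$.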

It is well known that the time complexity of the Pohlig-Hellman algorithm is
$O(\sqrt{p})$ [11].

2.  Baby  Step,  Giant  Step 

Eve is trying to solve $a^x \equiv \beta \pmod{p}$ for $x$. The following algorithm was
developed by Daniel Shanks [11].

First, choose an integer $N$ with $N^2 \ge p-1$. Next, make two lists:

1. $a^j \bmod p$ for $0 \le j < N$

2. $\beta a^{-Nk} \bmod p$ for $0 \le k < N$

Look for a match between the two lists. If one is found, then $a^j \equiv \beta a^{-Nk}$,
so $a^{j+Nk} \equiv \beta$. Therefore, $x = j + Nk$ and the discrete logarithm is solved.

The complexity of the baby step, giant step algorithm is also $O(\sqrt{p})$, but it
requires storing approximately $\sqrt{p}$ numbers in memory and is therefore
impractical for very large primes, such as $10^{20}$ or larger [11].
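
A compact Python sketch of the two lists is given below. It stores the first list in a dictionary so that each entry of the second list can be checked in constant time; the function name is illustrative, and `pow` with a negative exponent assumes Python 3.8 or later.

```python
from math import isqrt

def baby_step_giant_step(a, beta, p):
    """Solve a^x = beta (mod p); returns x = j + N*k or None if no match."""
    n = isqrt(p - 1)
    if n * n < p - 1:
        n += 1                        # ensure N^2 >= p - 1
    baby = {}                         # list 1: a^j mod p -> j
    value = 1
    for j in range(n):
        baby.setdefault(value, j)
        value = value * a % p
    step = pow(a, -n, p)              # a^(-N) mod p
    gamma = beta % p                  # list 2: beta * a^(-N k) mod p
    for k in range(n):
        if gamma in baby:
            return baby[gamma] + n * k
        gamma = gamma * step % p
    return None

print(baby_step_giant_step(2, 3, 101))   # 69
print(baby_step_giant_step(2, 30, 37))   # 14
```

The dictionary holds about $\sqrt{p}$ entries, which is the memory cost mentioned above.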

3.  The  Index  Calculus 

Again, Eve is trying to solve $a^x \equiv \beta \pmod{p}$ for $x$. The idea in the index
calculus method is similar to the quadratic sieve method of factoring [11].

The first step is a precomputation step and involves picking a factor base
and searching for a set of $r$ linearly independent relations between the factor
base and the powers of $a$. Let $B$ be a bound and let $p_1, p_2, \ldots, p_m$ be the primes
less than $B$. This is our factor base. We then compute $a^k \pmod{p}$ for $r$ values
of $k$. For each number, try to write it as a product of the primes in the factor base. If this is not
possible, discard $a^k$. However, if $a^k \equiv \prod p_i^{a_i} \pmod{p}$, then

$k \equiv \sum a_i L_a(p_i) \pmod{p-1}$.

When we obtain enough relations, we can solve for $L_a(p_i)$ for each $i$.

Next, for random integers $s$, compute $\beta a^s \pmod{p}$. For each such
number, try to write it as a product of primes less than $B$. If we succeed, we
have $\beta a^s \equiv \prod p_i^{b_i} \pmod{p}$, which means

$L_a(\beta) \equiv -s + \sum b_i L_a(p_i) \pmod{p-1}$.

Using this algorithm, any $p$ over 200 digits will be difficult to solve, which
makes the Index Calculus good only for moderate-sized primes [11]. One can
show that the time complexity of the Index Calculus is $O\left(e^{c (\ln p)^{1/3} (\ln \ln p)^{2/3}}\right)$ for some
$c > 0$, if implemented by the Number Field Sieve.
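
The second phase of the method can be illustrated with a toy prime. The sketch below makes two simplifying assumptions that are not part of the method itself: the logarithms of the factor base are obtained by exhaustive search rather than by solving the linear relations of the precomputation step, and the integers $s$ are tried in order rather than at random. The function names and the factor base $\{2, 3, 5, 7\}$ are illustrative.

```python
def factor_over_base(n, base):
    """Exponents of n over the factor base, or None if n does not factor."""
    exps = []
    for q in base:
        e = 0
        while n % q == 0:
            n //= q
            e += 1
        exps.append(e)
    return exps if n == 1 else None

def base_logs_by_search(a, p, base):
    """Stand-in for the precomputation: L_a(p_i) by exhaustive search.
    Assumes a is a primitive root, so every base prime is a power of a."""
    logs, value = {}, 1
    for k in range(p - 1):
        if value in base:
            logs[value] = k
        value = value * a % p
    return [logs[q] for q in base]

def index_calculus_lookup(a, beta, p, base, logs):
    """Find s with beta * a^s smooth, then L_a(beta) = -s + sum b_i L_a(p_i)."""
    for s in range(p - 1):
        exps = factor_over_base(beta * pow(a, s, p) % p, base)
        if exps is not None:
            return (-s + sum(b * l for b, l in zip(exps, logs))) % (p - 1)
    return None

p, a, base = 101, 2, [2, 3, 5, 7]
logs = base_logs_by_search(a, p, base)
x = index_calculus_lookup(a, 11, p, base, logs)
print(x, pow(a, x, p))  # the second value is 11
```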

C.  THE  DIFFIE-HELLMAN  PROBLEM 

We described how solving the discrete logarithm easily would allow Eve to
arrive at the secret key. There is another problem Eve can solve to arrive at the
secret key, namely the Diffie-Hellman Problem. The Diffie-Hellman Problem
comes in two flavors, the computational and the decisional. The Computational
Diffie-Hellman Problem is defined as follows: Let $p$ be a prime and let $a$ be a
primitive root mod $p$. Given $a^x \pmod{p}$ and $a^y \pmod{p}$, find $a^{xy} \pmod{p}$. Recall
that Eve has access to both $a^x$ and $a^y$ as they are both made public during the
exchange. It is not currently known whether or not this problem is easier than
computing discrete logs [11]. A related problem, known as the Decisional
Diffie-Hellman Problem, is defined as follows: Let $p$ be a prime and let $a$ be a
primitive root mod $p$. Given $a^x \pmod{p}$ and $a^y \pmod{p}$, and $K \not\equiv 0 \pmod{p}$,
decide whether or not $K \equiv a^{xy} \pmod{p}$ [11]. In other words, if someone offers a
number to Eve and claims it is $K$, can Eve decide whether or not that person is
telling the truth with the information captured in the open channel? Like the
computational Diffie-Hellman problem, the decisional Diffie-Hellman problem has
yet to be solved. It is unknown whether a method for solving the decisional
problem will lead to a solution for the computational problem.

The methods described above for solving discrete logarithms force
applications that rely on the difficulty of solving discrete logs to stay away from
certain primes. Obviously, the larger the prime used, the better. Baby-step
Giant-step and the Index Calculus become infeasible to use when primes are
larger than 200 digits. The Pohlig-Hellman algorithm relies on the factorization of
$p-1$ consisting of only small primes. If $p-1$ does not have only small prime factors,
the algorithm becomes inefficient. Therefore, the primes chosen when using the
Diffie-Hellman protocol should have at least one large prime in the factorization
of $p-1$. This situation gives rise to the attack we will focus on. If $p-1$ contains
a very large prime, such that $p-1 = Rq$ with $q$ prime and $R$ a small integer, an
unauthenticated exchange becomes vulnerable to an active man-in-the-middle
attack that we will discuss next.

IV. MAN-IN-THE-MIDDLE ATTACK


A.  THEORY  BEHIND  THE  ATTACK 

Wiener and van Oorschot [2] noted that, if certain primes are used, a
potentially fatal protocol attack on the Diffie-Hellman key exchange protocol
becomes possible. The idea is based on forcing the parties to agree on a shared
key that resides in a subgroup of the cyclic group $\mathbb{Z}_p^*$. If the order of the
subgroup is small enough, an adversary can exhaustively search the subgroup,
retrieve the secret key, and eavesdrop on the communication of Alice and Bob.

For instance, consider the case when the prime used for the key
exchange is of the form $p = 2q+1$, where $q$ is a prime. Then, $a^q = a^{(p-1)/2}$.

Claim: $a^{(p-1)/2}$ is an element of order two.

Proof: By Fermat’s little theorem, $a^{p-1} \equiv 1 \bmod p$. So $a^{(p-1)/2}$ must be $+1$ or
$-1$. But if $a^{(p-1)/2} = 1$, then the order of $a$ divides $(p-1)/2$. This is a contradiction,
because $a$ is a primitive root of $\mathbb{Z}_p^*$ and must be of order $p-1$. So $a^{(p-1)/2} = -1$
and is an element of order two. □

If Alice and Bob respectively send each other unauthenticated messages
$a^x$ and $a^y$, an active intruder may substitute $(a^x)^q$ for the first, and $(a^y)^q$ for the
second. When Alice receives $(a^y)^q$ and computes $(a^{qy})^x$ and when Bob
receives $(a^x)^q$ and computes $(a^{qx})^y$, they will arrive at only one of two possible
values, $+1$ and $-1$. The intruder can then try both possible keys and gain access
to Alice and Bob’s secret communications. Obviously, if Alice and Bob
demonstrate vigilance, they will agree in advance to suspect any key agreement
that arrives at $+1$ or $-1$.

Because $0 \le m \le R-1$, this is in our original list and $a^{(p-1)/R}$ is of order $R$. □

So, if the prime Alice and Bob agree to use is of the form $p = Rq+1$, Eve
can force them to agree on a key in a subgroup of $\mathbb{Z}_p^*$ of order $R$ by replacing
$a^x$ and $a^y$ with $(a^x)^q$ and $(a^y)^q$. Even if Alice and Bob are vigilant, the key can
take any of $R$ values and the generalized attack poses a significant threat to an
unauthenticated key exchange using the Diffie-Hellman protocol.
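
The effect of the substitution can be checked exhaustively for toy primes. The sketch below is illustrative only: it runs every pair of secrets $x$, $y$ through an exchange in which both messages are raised to the power $q$ and collects the keys that result. The primes 23 and 31 and their primitive roots 5 and 3 are chosen for illustration.

```python
def possible_keys(p, a, q):
    """All keys Alice and Bob can reach once Eve raises both messages to q."""
    keys = set()
    for x in range(1, p - 1):
        for y in range(1, p - 1):
            from_alice = pow(pow(a, x, p), q, p)   # Eve's substitute for a^x
            from_bob = pow(pow(a, y, p), q, p)     # Eve's substitute for a^y
            key_alice = pow(from_bob, x, p)
            key_bob = pow(from_alice, y, p)
            assert key_alice == key_bob            # both still agree on a key
            keys.add(key_alice)
    return keys

print(sorted(possible_keys(23, 5, 11)))   # p = 2*11 + 1: two keys
print(sorted(possible_keys(31, 3, 5)))    # p = 6*5 + 1: six keys
```

For $p = 23$ only $+1$ and $-1$ (that is, 1 and 22) occur; for $p = 31$ the keys are exactly the six elements whose order divides $R = 6$.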

B. CREATING THE ENVIRONMENT


Eve  must  force  Alice  and  Bob  into  a  subgroup  of  small  order  to  conduct 
this  attack.  Figure  3  represents  a  possible  algorithm  Eve  could  follow. 

NOTE: Eve only needs to consider cases when $R$ is even, because if $R$ is odd,
$(p-1)/R$ must be even and cannot be prime. Also, if Eve calculates $(p-1)/m$, $m \in \mathbb{Z}$, as
a non-integer, she can obviously ignore trying any number of the form $(p-1)/(km)$, $k \in \mathbb{Z}$,
because it will also not be an integer.

Figure 3. Attack Algorithm


The most important step in creating the environment to conduct the attack
is searching $(p-1)/k$, $k = 2, 4, \ldots, R$, until we find a prime. We cannot continue the
attack until we find such a prime. Obviously, the longer Alice and Bob are kept
waiting for return correspondence, the more suspicious they will become of
possible compromise of their communication. Therefore, we need the fastest
possible method to detect primality. From our discussion in Chapter II, we know
a probabilistic primality test suits us best. Specifically, we could use the
Miller-Rabin primality test with complexity $O((\log n)^3)$.


If we are forced to search the entire index $k$ from $2 \le k \le R$, how long
might this take us? Recall that we only need try even values of $k$, and in the
worst case, we may be forced to try all $R/2$ even numbers. Therefore, the worst
case scenario in searching for a prime would take

$\sum_{k=1}^{R/2} O((\log N)^3) = O((R/2) \cdot (\log N)^3) = O((\log N)^3)$

steps, with $N$ being the input number into the Miller-Rabin primality test. Because
$R$ is a fixed bound, only the constant value in the Big-O estimate changes, and the
algorithm remains bounded by the time it takes to conduct the primality tests.
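
Eve's search can be sketched as follows. The primality test is a Miller-Rabin test run with the fixed bases 2, 3, 5, and 7 instead of random ones; this is an assumption made here so that the result is reproducible, and with these bases the test is known to give the correct answer for all inputs below 3,215,031,751. The function names and the bound of 100 in the usage lines are illustrative.

```python
def is_probable_prime(n, bases=(2, 3, 5, 7)):
    """Miller-Rabin test with fixed bases (deterministic for small n)."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7):
        if n % small == 0:
            return n == small
    d, s = n - 1, 0
    while d % 2 == 0:                  # write n - 1 = 2^s * d with d odd
        d //= 2
        s += 1
    for b in bases:
        x = pow(b, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False               # b is a witness that n is composite
    return True

def find_small_cofactor(p, bound):
    """Smallest even R <= bound with (p-1)/R a prime q; returns (R, q) or None."""
    for R in range(2, bound + 1, 2):
        if (p - 1) % R == 0 and is_probable_prime((p - 1) // R):
            return R, (p - 1) // R
    return None

print(find_small_cofactor(10007, 100))  # (2, 5003)
print(find_small_cofactor(19991, 100))  # (10, 1999)
```

The divisibility check `(p - 1) % R == 0` skips non-integer quotients directly, which subsumes the shortcut described in the note above.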


As an example, suppose Eve was listening to Alice and Bob agree upon
the prime number to use for their key exchange to take place in the near future.
The prime number they choose is $p = 10007$ with the base $a = 3$. Eve
uses the attack algorithm in Figure 4 to attempt to force Alice and Bob to agree to
a key in a subgroup of $\mathbb{Z}_{10007}^*$.

First,

$\frac{p-1}{2} = \frac{10006}{2} = 5003$

Next, Eve runs 5003 through the Miller-Rabin primality test and the result
is prime.

This situation represents the initial case described above with the prime
number being of the form $p = 2q+1$. Specifically, $10007 = 2 \cdot 5003 + 1$. Next, Eve
must intercept the number Alice attempts to send to Bob. Suppose Alice
chooses $x = 758$ and attempts to send $a^x$ ($3^{758} \bmod 10007 = 4865$) to Bob.

A -> E: 4865

Eve intercepts the communication, then takes $a^x$ (4865) and raises it to the
$q$ power.

$(a^x)^q = (3^{758})^{5003} \bmod 10007$

Meanwhile, Eve must also intercept the number Bob is attempting to send
to Alice. Suppose Bob chooses $y = 555$ and attempts to send
$a^y$ ($3^{555} \bmod 10007 = 1771$) to Alice.

B -> E: 1771

Eve again intercepts the communication, and takes $a^y$ (1771) and raises it
to the $q$ power.

$(a^y)^q = (3^{555})^{5003} \bmod 10007$

Eve then sends the results to the intended recipients.

E -> B: $4865^q \bmod 10007$
E -> A: $1771^q \bmod 10007$

Alice and Bob then both finish the key agreement by raising the received
number to their private keys, $x$ and $y$ respectively, and arrive at the same
number, the “secret” key.

$(a^{yq})^x = (a^{xq})^y$

As a result of the theory discussed above, without any knowledge of $x$ or
$y$, Eve knows the only possible keys are 1 and 10006. Eve must wait for a
message to be sent between Alice and Bob, try both keys, and figure out which
one is being used. She can then eavesdrop, and Alice and Bob’s secret
communication has been compromised.

However, as mentioned before, any vigilance on the part of Alice or Bob
would cause suspicion if the key agreed upon were of the form $+1$ or $-1$.

Now, suppose the prime number Alice and Bob agreed upon was
$p = 19991$ and $a = 3$. Eve must again search for a large prime factor of $p-1$.

First,

$\frac{p-1}{2} = \frac{19990}{2} = 9995$

Next, Eve would run 9995 through the Miller-Rabin primality test.
However, because it ends with a five, five must be a factor and it cannot be a
prime number.

Continuing, $\frac{p-1}{4} = \frac{19990}{4} = 4997.5$ is not an integer.

$\frac{p-1}{6} = \frac{19990}{6} \approx 3331.67$ is not an integer.

Because $\frac{p-1}{4}$ was not an integer, we skip $\frac{p-1}{8}$.

$\frac{p-1}{10} = \frac{19990}{10} = 1999$

Next, Eve runs 1999 through the Miller-Rabin primality test and the result
is prime.

Eve has found a large prime factor of $p-1$. This situation resembles the
generalized attack with a prime of the form $p = Rq+1$; in this case
$19991 = 10 \cdot 1999 + 1$. Intercepting, altering, and retransmitting the messages as
she did above, Eve again forces Alice and Bob into a subgroup of the original
cyclic group. This time, however, there are ten possibilities for the “secret” key.

Every value Eve forwards lies in the subgroup of $\mathbb{Z}_{19991}^*$ of order ten, so Alice
and Bob can arrive at no more than ten values for their key. Eve must wait for Alice and
Bob to communicate with their new key and see which of the ten values Alice
and Bob agreed on. Once a message is intercepted, Eve can pull it offline,
attempt each possible key, determine the key they agreed upon, and listen in on
Alice and Bob’s communication.
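
Both worked examples can be replayed numerically. The sketch below is illustrative: `confined_exchange` repeats one exchange with Eve's substitution, and `candidate_keys` lists every value of the form $z^q \bmod p$, which is the set Eve has to try. The secrets 758 and 555 are reused for the second prime purely for illustration; the check does not depend on $a$ being a primitive root.

```python
def confined_exchange(p, a, q, x, y):
    """Replay one exchange with Eve raising both messages to the power q."""
    sent_by_alice = pow(a, x, p)
    sent_by_bob = pow(a, y, p)
    key_alice = pow(pow(sent_by_bob, q, p), x, p)   # Alice receives (a^y)^q
    key_bob = pow(pow(sent_by_alice, q, p), y, p)   # Bob receives (a^x)^q
    assert key_alice == key_bob
    return sent_by_alice, sent_by_bob, key_alice

def candidate_keys(p, q):
    """Every value z^q mod p: the only keys the parties can arrive at."""
    return {pow(z, q, p) for z in range(1, p)}

# First example: p = 10007 = 2 * 5003 + 1.
sent_a, sent_b, key = confined_exchange(10007, 3, 5003, 758, 555)
print(sent_a, key, sorted(candidate_keys(10007, 5003)))

# Second example: p = 19991 = 10 * 1999 + 1.
_, _, key2 = confined_exchange(19991, 3, 1999, 758, 555)
candidates = candidate_keys(19991, 1999)
print(key2 in candidates, len(candidates))
```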


C.  PRIMES  OF  THE  FORM  Rq  + 1 

For this man-in-the-middle attack to be possible, Alice and Bob must
agree to choose a prime of the form $Rq+1$. How likely is it, assuming Alice and
Bob are using random large primes, that the prime they choose will be of the
correct form? To answer this question, we must first count the number of primes
$p$ such that $p = Rq+1$. We can begin with the case where $R = 2$. This
represents the original case in the man-in-the-middle attack, where $p = 2q+1$.
These particular prime numbers have their own name. A prime $p$ is a so-called
Sophie Germain (SG) prime if $2p+1$ is also prime. If we let $\pi_{SG}(t)$ be the
number of SG primes not exceeding $t$, it can be demonstrated that

$\pi_{SG}(t) = O\left(\frac{t}{(\log t)^2}\right)$ [13]

Now, considering the general case, if we fix $R$, then the number of primes $p \le t$
of the form $p = Rq+1$ is

$\pi_{Rq+1}(t) = O\left(\frac{t}{\varphi(R)(\log(t/R))^2}\right)$

where $\varphi$ is Euler’s phi function [14]. However, in the attack $R$ can range from
2 to some bound, say $B$. Therefore, we must sum the cases from $R = 2$ to
$R = B$. The number of primes $p \le t$ such that $p = Rq+1$ with $q$ prime, ranging over
$2 \le R \le B$ with $B < t^{1/2}$, is

$O\left(\frac{t \log B}{(\log t)^2}\right)$

The prime number theorem states that, if $\pi(x)$ is the prime counting
function, then $\lim_{x \to \infty} \frac{\pi(x)}{x / \ln(x)} = 1$. Roughly speaking, this tells us that if you
randomly select a number close to a large number $N$, the odds of it being prime
are about $1/\ln(N)$. By the prime number theorem, it follows that $\lim_{x \to \infty} \frac{\pi_{SG}(x)}{\pi(x)} = 0$.
If we let $\pi_{Rq+1}(t)$ count the number of primes of the form $p = Rq+1$ not exceeding
$t$, it follows that $\lim_{x \to \infty} \frac{\pi_{Rq+1}(x)}{\pi(x)} = 0$ as well. This tells us that, as $x$ gets very large,
it becomes increasingly unlikely that a random prime number is a Sophie Germain prime or any
prime of the form $Rq+1$.

Using the prime number theorem and Big-O estimates above with a
constant value of one, we can approximate the numbers of primes of different
forms. Table 5 lists these approximations using scientific notation. The $R$ value
corresponds to different values for primes of the form $p = Rq+1$. The ratios listed
are: (primes of the given form) / (total primes).

                0-64 bits        64-128 bits      128-256 bits
Total Primes    4.1583e17        3.8353e36        6.5255e74
R=2 (S.G.)      9.3737e15        4.3228e34        3.6775e72
                ratio: .0225     ratio: .0113     ratio: .0056
R=100           4.316e16         1.9907e35        1.6935e73
                ratio: .1038     ratio: .0519     ratio: .0260
R=10^4          8.6335e16        3.9815e35        3.3871e73
                ratio: .2076     ratio: .1038     ratio: .0519
R=10^6          1.295e17         5.9722e35        5.0806e73
                ratio: .3114     ratio: .1557     ratio: .0779

                256-512 bits     512-1024 bits    1024-2048 bits
Total Primes    3.778e151        2.5327e305       2.2765e613
R=2 (S.G.)      1.0646e149       3.5683e302       1.6037e610
                ratio: .0028     ratio: .0014     ratio: .0007
R=100           4.9024e149       1.6433e303       7.3853e610
                ratio: .0130     ratio: .0065     ratio: .0032
R=10^4          9.8049e149       3.2865e303       1.477e611
                ratio: .0260     ratio: .0130     ratio: .0065
R=10^6          1.4707e150       4.9298e303       2.2156e611
                ratio: .0389     ratio: .0195     ratio: .0097

Table 5. Prime Number Approximations

The approximations demonstrate the increasing unlikelihood of a random
prime being of the form $p = Rq+1$. Using our approximations, around 64 bits
over 30% of all primes match the form with a bound of $10^6$. However, when we
consider primes around 2048 bits, the percentage drops below one percent. If we
increase the bound we can increase the likelihood, but increasing the bound
forces the attacker to search through more keys to find the correct one.
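
The ratios in Table 5 can be reproduced from closed-form expressions. The formulas used in the sketch below are an inference from the table entries rather than something stated explicitly: with $t = 2^{\text{bits}}$, the total is taken as $t / \ln t$, the Sophie Germain count as $t / (\ln t)^2$, and the count for a bound $B$ as $t \ln B / (\ln t)^2$. Dividing by the total leaves expressions that no longer involve $t$ itself, which avoids overflow for the larger sizes. The function names are illustrative.

```python
import math

def ratio_sophie_germain(bits):
    """(t / (ln t)^2) / (t / ln t) = 1 / ln t, with t = 2^bits."""
    return 1 / (bits * math.log(2))

def ratio_bounded(bits, bound):
    """(t ln B / (ln t)^2) / (t / ln t) = ln B / ln t, with t = 2^bits."""
    return math.log(bound) / (bits * math.log(2))

for bits in (64, 128, 256, 512, 1024, 2048):
    row = [ratio_sophie_germain(bits)]
    row += [ratio_bounded(bits, b) for b in (100, 10**4, 10**6)]
    print(bits, " ".join(f"{r:.4f}" for r in row))
```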

D.  COUNTERMEASURES  AGAINST  THE  ATTACK 

To  prevent  this  potentially  fatal  protocol  attack,  Alice  and  Bob  have 
several  options.  The  easiest  method  is  to  force  authentication  prior  to  the  key 
exchange.  Another  method  that  prevents  the  attack  is  based  on  creating  a  prime 
order  subgroup  before  the  key  exchange  takes  place. 

1. Authentication

The attack we have discussed is not the only man-in-the-middle attack
to which Diffie-Hellman is vulnerable. The Appendix details another attack that is
possible if no authentication occurs prior to the key exchange. To combat these attacks, a
variation of Diffie-Hellman that ensures authentication can be used. An example
of such a variation is the Station-to-Station protocol (STS). STS is a three-pass
variation of the basic Diffie-Hellman protocol that allows the establishment of a
shared secret key between two parties with mutual entity authentication and
mutual explicit key authentication [1]. The STS employs digital signatures. A
digital signature of a message is a number dependent on some secret known
only to the signer and, additionally, on the content of the message being signed
[1]. The STS protocol is frequently employed with the RSA signature scheme.

To  employ  an  RSA  signature  scheme,  public  and  private  key  pairs  must 
first  be  generated. 

RSA signature scheme key generation steps [1]:

1. Generate two large distinct random primes $p$ and $q$, each
roughly the same size.

2. Compute $n = pq$ and $\phi = (p-1)(q-1)$.

3. Select a random integer $e$, $1 < e < \phi$, such that $\gcd(e, \phi) = 1$.

4. Use the extended Euclidean algorithm to compute the unique
integer $d$, $1 < d < \phi$, such that $ed \equiv 1 \pmod{\phi}$.

5. The user’s public key is $(n, e)$ and the user’s private key is $d$.

NOTE: Each user should generate a public and private key.
Now,  if  a  user  Alice  wants  to  sign  a  message  m  ,  and  a  user  Bob  wants  to 
verify  the  message  signature,  the  remaining  steps  of  the  protocol  must  be 
completed. 

RSA signature scheme protocol steps [1]:

1. Signature generation

a. Compute $\tilde{m} = R(m)$, an integer in the range $[0, n-1]$, where $R$ is a public redundancy function.

b. Compute $s = \tilde{m}^d \bmod n$.

c. Alice’s signature for $m$ is $s$.

2. Signature verification

a. Obtain Alice’s authentic public key $(n, e)$.

b. Compute $\tilde{m} = s^e \bmod n$.

c. Recover $m = R^{-1}(\tilde{m})$.
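
A toy run of key generation, signing, and verification is sketched below. It is illustrative only: the primes 61 and 53 are far too small for real use, the redundancy function $R$ is assumed here to be the identity on integers below $n$ (so no redundancy is actually checked), and the function names are hypothetical. The modular inverse via `pow(e, -1, phi)` requires Python 3.8 or later and stands in for the extended Euclidean algorithm.

```python
from math import gcd

def rsa_keygen(p, q, e):
    """Return ((n, e), d) for primes p, q and a public exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to phi")
    d = pow(e, -1, phi)            # ed = 1 (mod phi)
    return (n, e), d

def rsa_sign(m, d, n):
    """s = R(m)^d mod n, with R assumed to be the identity for m < n."""
    if not 0 <= m < n:
        raise ValueError("message out of range")
    return pow(m, d, n)

def rsa_recover(s, public_key):
    """m = R^-1(s^e mod n), again with R assumed to be the identity."""
    n, e = public_key
    return pow(s, e, n)

public_key, d = rsa_keygen(61, 53, 17)
s = rsa_sign(65, d, public_key[0])
print(public_key, d, rsa_recover(s, public_key))  # (3233, 17) 2753 65
```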

With the knowledge of a digital signature scheme, in particular RSA, we
can move on to the STS protocol. If we let $E$ denote a symmetric encryption
algorithm, and $S_A(m)$ denote Alice’s signature on $m$, the protocol is as follows
[1]:

1. Setup

a. A prime number $p$ and generator $a$ of $\mathbb{Z}_p^*$ ($2 \le a \le p-2$) are
selected and published.

b. Alice selects RSA public and private signature keys $(n_A, e_A)$
and $d_A$ (Bob selects analogous keys). Assume each party
has access to authentic copies of the other’s public key.

2. Actions

a. Alice generates a secret random $x$, $1 \le x \le p-2$, and sends
Bob $a^x \bmod p$.

A -> B: $a^x \bmod p$ (message 1)

b. Bob generates a secret random $y$, $1 \le y \le p-2$, and
computes the shared key $k = (a^x)^y \bmod p$. Bob signs the
concatenation of both exponentials, encrypts this using the
computed key, and sends it to Alice.

B -> A: $a^y \bmod p$, $E_k(S_B(a^y, a^x))$ (message 2)

c. Alice computes the shared key $k = (a^y)^x \bmod p$, decrypts the
encrypted data, and uses Bob’s public key to verify the
received value as the signature on the hash of the cleartext
exponential received and the exponential sent in message 1.
Upon successful verification, Alice accepts that $k$ is actually
shared with Bob, and sends Bob an analogous message.

A -> B: $E_k(S_A(a^x, a^y))$ (message 3)

d. Bob similarly decrypts the received message and verifies
Alice’s signature therein. If successful, Bob accepts that $k$
is actually shared with Alice.

The  exchanged  exponentials  are  digitally  signed  and  retransmitted  during 
the  STS  protocol.  Therefore,  Eve  cannot  alter  the  original  exponentials  without 
triggering  a  failure  during  Alice  and  Bob’s  key  agreement.  This  precludes  the 
man-in-the-middle  attack  we  have  focused  on  and  defends  Alice  and  Bob’s  key 
exchange  against  several  other  possible  active  man-in-the-middle  attacks. 
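
The way the signatures expose Eve's substitution can be seen in a stripped-down simulation of message 2. The sketch below is a simplification under stated assumptions, not the STS protocol itself: the symmetric encryption layer $E_k$ is omitted, the signature is a toy hash-then-sign RSA with SHA-256 reduced modulo $n$, Bob's key is built from the small primes 104729 and 1299709, and all function names are hypothetical.

```python
import hashlib

def digest(values, n):
    """Hash a tuple of integers to an integer below n (toy encoding)."""
    data = ",".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(values, d, n):
    return pow(digest(values, n), d, n)

def verify(values, signature, e, n):
    return pow(signature, e, n) == digest(values, n)

def alice_accepts(p, a, q, x, y, bob_key, tamper):
    """Return True if Alice's check of Bob's signature in message 2 succeeds."""
    (n, e), d = bob_key
    sent_by_alice = pow(a, x, p)
    seen_by_bob = pow(sent_by_alice, q, p) if tamper else sent_by_alice
    sent_by_bob = pow(a, y, p)
    signature = sign((sent_by_bob, seen_by_bob), d, n)     # S_B(a^y, a^x)
    seen_by_alice = pow(sent_by_bob, q, p) if tamper else sent_by_bob
    # Alice checks the signature against what she received and what she sent.
    return verify((seen_by_alice, sent_by_alice), signature, e, n)

# Toy RSA key for Bob (illustrative primes, far too small for real use).
n = 104729 * 1299709
e = 65537
d = pow(e, -1, (104729 - 1) * (1299709 - 1))
bob_key = ((n, e), d)

print(alice_accepts(10007, 3, 5003, 758, 555, bob_key, tamper=False))
print(alice_accepts(10007, 3, 5003, 758, 555, bob_key, tamper=True))
```

In the tampered run Bob signs the pair he actually saw, which differs from the pair Alice checks, so verification fails and the exchange is aborted.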

2.  Prime  Order  Subgroups 

Van Oorschot and Wiener [2] noticed the potentially fatal man-in-the-middle
attack and reasoned that restricting computations to prime-order
subgroups would prevent the attack. In this case, we will force the prime number
$p$ that defines the environment to be of the form $p = Rq+1$, where $R$ is a small
integer and $q$ is a large prime. Now, instead of using a generator $a$ of $\mathbb{Z}_p^*$ as
our base for exponentiation, we compute $g = a^{(p-1)/q}$ and let $g$ be our new base.

Claim: The element $g$ generates a subgroup of order $q$.

Proof: Suppose $g$ is of order $k < q$ and so $g^k = 1$. Then $a^{(p-1)k/q} = 1$. But
$k/q < 1$ and so $(p-1) \cdot k/q < (p-1)$. This means $a$ is of order $< (p-1)$, a
contradiction because $a$ is a generator of $\mathbb{Z}_p^*$. Therefore, $g$ must be of order
$\ge q$. But $g^q = a^{(p-1)q/q} = a^{(p-1)} = 1$, so $g$ is of order $q$ and $\langle g \rangle$ is a subgroup of
order $q$. □

By using g instead of a to conduct the key exchange, Alice and Bob are
working in a prime-order subgroup instead of a group of order p - 1. The
man-in-the-middle attack we have discussed is based on forcing the parties into a
subgroup of small order and exhaustively searching the smaller key space.
However, by Lagrange’s theorem, the order of any subgroup must divide the
order of the group. The order of the group generated by g is q. Therefore, any
subgroup of it must be of order q or 1, because those are the only divisors of q.
Thus, the prime-order subgroup cannot be divided any further and this
man-in-the-middle attack becomes infeasible.
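The Lagrange argument can be made concrete by listing the element orders that
actually occur. The sketch below is a toy illustration under assumed parameters
(p = 67, q = 11, a = 2, none of which come from this thesis). It compares the
full group with the subgroup generated by g, and shows what raising to the power
q does in each.

```python
# Assumed toy parameters: p = 6*11 + 1, generator a = 2 of Z_p^*.
p, q = 67, 11
a = 2
g = pow(a, (p - 1) // q, p)


def order(x, p):
    """Multiplicative order of x modulo p, found by brute force."""
    k, y = 1, x % p
    while y != 1:
        y = (y * x) % p
        k += 1
    return k


full_group = {pow(a, k, p) for k in range(p - 1)}  # all of Z_p^*
subgroup = {pow(g, k, p) for k in range(q)}        # the subgroup generated by g

orders_full = sorted({order(x, p) for x in full_group})
orders_sub = sorted({order(x, p) for x in subgroup})

# Raising to q is the alteration Eve applies in the attack discussed above.
raised_full = {pow(x, q, p) for x in full_group}
raised_sub = {pow(x, q, p) for x in subgroup}

print(orders_full)       # every divisor of p - 1 occurs
print(orders_sub)        # only 1 and q occur
print(len(raised_full))  # R distinct values: a small but nontrivial key space
print(raised_sub)        # only the identity element remains
```

In the full group the q-th powers form a subgroup of order R, which is the small
key space the attack relies on. Inside the prime-order subgroup the same
operation leaves only the identity element.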

The Internet Engineering Task Force (IETF) has adopted the prime-order
subgroup tactic to prevent the type of attack we have focused on. In particular,
Request for Comments (RFC) 2631 standardizes the technique for a particular
Diffie-Hellman variant, based on the American National Standards Institute X9.42
draft [15].

E.  EXTENDING  THE  ATTACK  TO  THE  N-PARTY  SETTING 

The  Diffie-Hellman  protocol  we  have  discussed  so  far  has  been  limited  to 
two  parties.  However,  protocols  have  been  created  that  extend  the  key 
agreement  to  group  communications.  Steiner,  Tsudik  and  Wainer  [16]  defined  a 
class  of  natural  extensions  of  Diffie-Hellman  to  the  n-party  setting.  These
protocols,  without  the  countermeasures  discussed  above,  are  vulnerable  to  the 
man-in-the-middle  attack  we  have  focused  on.  We  now  move  to  demonstrate  the 
attack  on  two  of  the  protocols  the  authors  describe.  First,  we  consider  the 
protocol  the  authors  name  Group  Diffie-Hellman  version  1  (GDH.1).  In  this 
section,  to  keep  with  the  original  notation  of  [16],  we  use  set  notation  to  mean  an 
ordered  tuple. 

We call the participants of the n-party key exchange $\{M_1, M_2, \ldots, M_n\}$. As
in the two-party case, a prime number p and a generator a of the group $Z_p^*$ are
selected and published. Each member $M_i$ chooses a random secret number
$s_i$, $0 < s_i < p-2$. The protocol consists of two stages: upflow and downflow.

In the upflow stage, each member makes their contribution to the shared
key. A member $M_i$ receives a collection of intermediate values, and has the task
of raising the last in the list of incoming intermediate values to the power of $s_i$.
Then $M_i$ appends the result to the incoming set of values and forwards all to
$M_{i+1}$. As an example, $M_3$ would receive $\{a^{s_1}, a^{s_1 s_2}\}$ from $M_2$. $M_3$ would then
compute $a^{s_1 s_2 s_3}$, append the result to the incoming message to create
$\{a^{s_1}, a^{s_1 s_2}, a^{s_1 s_2 s_3}\}$ and forward it to $M_4$.

The upflow stage is completed when $M_n$ calculates $a^{s_1 s_2 \cdots s_n}$, which is the
intended group key, $K_n$. Once $M_n$ has obtained $K_n$, the downflow stage is
initiated. Each member $M_i$ receives i values, one to compute $K_n$ and i-1
to send to $M_{i-1}$. For example, if n = 4, $M_3$ would receive
$\{a^{s_4}, a^{s_1 s_4}, a^{s_1 s_2 s_4}\}$ from $M_4$. First, $M_3$ would use the last value to
compute $K_n = a^{s_1 s_2 s_3 s_4}$. Then, the remaining values would be raised to $s_3$ and
$\{a^{s_4 s_3}, a^{s_1 s_4 s_3}\}$ would be sent to $M_2$. $M_2$ would repeat the procedure, and
would send $\{a^{s_4 s_3 s_2}\}$ to $M_1$. The downflow stage is then completed when $M_1$
computes $K_n = a^{s_1 s_2 s_3 s_4}$. GDH.1 is depicted in
Figure 4.
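The two stages of GDH.1 can be traced in a few lines of code. The following
Python sketch follows the description above; the function name `gdh1`, the toy
parameters p = 67 and a = 2, and the fixed secrets are assumptions made for
illustration and are not taken from [16] or from this thesis.

```python
def gdh1(p, a, secrets):
    """Simulate GDH.1 for members M_1..M_n; return the key each member computes."""
    n = len(secrets)
    keys = [None] * n

    # Upflow: M_1 starts with a^{s_1}; each M_i (1 < i < n) raises the last
    # incoming value to s_i and appends the result before forwarding.
    msg = [pow(a, secrets[0], p)]
    for i in range(1, n - 1):
        msg = msg + [pow(msg[-1], secrets[i], p)]

    # M_n raises the last value to s_n to obtain the group key K_n.
    keys[n - 1] = pow(msg[-1], secrets[n - 1], p)

    # Downflow: M_n sends a^{s_n} followed by every incoming value except the
    # last, each raised to s_n (n - 1 values in total).
    msg = [pow(a, secrets[n - 1], p)] + [pow(v, secrets[n - 1], p) for v in msg[:-1]]

    # M_{n-1} down to M_1: use the last value for the key, raise the rest to s_i.
    for i in range(n - 2, -1, -1):
        keys[i] = pow(msg[-1], secrets[i], p)
        msg = [pow(v, secrets[i], p) for v in msg[:-1]]
    return keys


# Assumed toy run with n = 4 members.
keys = gdh1(67, 2, [3, 7, 5, 9])
print(keys)  # four identical values
```

With n = 4 the intermediate lists in this sketch have the same shape as the
example in the text: $M_3$ receives three values from $M_4$ and passes two on to $M_2$.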


The active adversary, Eve, wishes to attack the key agreement by forcing the
n parties to agree on a key in a small subgroup of $Z_p^*$. As in the two-party case,
Eve must first, if possible, write the prime number p in the form p = Rq + 1
with q a large prime and R a small integer. Once this is done, Eve must
intercept and alter two messages to complete the attack. The first message she
must intercept is the first message sent, that is,

$M_1 \to M_2 : a^{s_1}$.

With $a^{s_1}$ captured, Eve computes $(a^{s_1})^q = a^{q s_1}$ and proceeds to send the
computed number as the message on to $M_2$. $M_2$ computes $a^{q s_1 s_2}$ and sends
$\{a^{q s_1}, a^{q s_1 s_2}\}$ to $M_3$. This continues until the end of the upflow stage, when $M_n$
computes $a^{q s_1 s_2 \cdots s_n} = K_n$. Eve has forced $K_n$ to be one of R values, based on the
theory of the attack described earlier in the chapter.
Next, Eve must intercept the first message sent during the downflow
stage. If n = 4, then

$M_4 \to M_3 : \{a^{s_4}, a^{q s_1 s_4}, a^{q s_1 s_2 s_4}\}$.

NOTE: Because of the alteration Eve completed in the upflow stage, only the
first part of the message must be altered.

Eve simply computes $(a^{s_4})^q = a^{q s_4}$, replaces the first number with the
computation, and forwards the message to $M_3$. The participants all arrive at
$K_n = a^{q s_1 s_2 s_3 s_4}$, and the key exchange has been successfully attacked.
However, in this case Eve had to capture and alter two very specific messages
for the attack to be successful. In the next protocol, Eve has more flexibility.
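The two alterations can be replayed in a small simulation. This is a sketch
under assumed toy parameters (p = 67 = 6 * 11 + 1, a = 2, fixed secrets), none
of which come from this thesis; `gdh1_attacked` is a hypothetical name. Eve's
two changes are marked in the comments.

```python
def gdh1_attacked(p, q, a, secrets):
    """GDH.1 run in which Eve raises two values to q; returns each member's key."""
    n = len(secrets)
    keys = [None] * n

    # Upflow. Eve intercepts M_1 -> M_2 and replaces a^{s_1} with a^{q s_1}.
    msg = [pow(pow(a, secrets[0], p), q, p)]
    for i in range(1, n - 1):
        msg = msg + [pow(msg[-1], secrets[i], p)]
    keys[n - 1] = pow(msg[-1], secrets[n - 1], p)

    # Downflow. M_n builds its message exactly as in the honest protocol.
    msg = [pow(a, secrets[n - 1], p)] + [pow(v, secrets[n - 1], p) for v in msg[:-1]]
    # Eve alters only the first entry, a^{s_n}; the others already contain q.
    msg[0] = pow(msg[0], q, p)

    for i in range(n - 2, -1, -1):
        keys[i] = pow(msg[-1], secrets[i], p)
        msg = [pow(v, secrets[i], p) for v in msg[:-1]]
    return keys


p, q, a = 67, 11, 2
R = (p - 1) // q
keys = gdh1_attacked(p, q, a, [3, 7, 5, 9])

# The key must be one of the R elements of the form a^{q j}.
candidates = {pow(a, q * j, p) for j in range(R)}
print(keys, sorted(candidates))
```

All members still agree on a key, so nothing in the protocol run itself signals
the change, while Eve only has to try the R candidate values.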

Next, we turn our attention to Group Diffie-Hellman version 3 (GDH.3).
GDH.3 reduces the amount of computation each party (except for $M_n$) must
complete, which may be very beneficial if the group size is large. The protocol
consists of four stages. The first stage is similar to the upflow stage of GDH.1 in
which every member contributes to the key. However, after processing the
upflow message, $M_{n-1}$ broadcasts $a^{s_1 s_2 \cdots s_{n-1}}$ to the entire group as the second
stage of the process. In stage three, each $M_i$, except $M_n$, factors out their
contribution ($s_i$) from the exponent of the broadcast value and forwards the
result to $M_n$. After $M_n$ collects all the values from the group, in the last stage
$M_n$ raises each value to $s_n$ and returns the values to the group. Now each $M_i$ has
$a^{\prod \{ s_k \mid k \in [1,n],\ k \neq i \}}$ and simply raises this value to $s_i$ to compute $K_n$.

For example, if n = 5, the upflow stage completes when $M_4$ computes
$a^{s_1 s_2 s_3 s_4}$. Then, in stage 2, this value is broadcast to the entire group. In stage
3, each member other than $M_5$ factors out their contribution and forwards the
result to $M_5$ (e.g., $M_2$ would send $a^{s_1 s_3 s_4}$). In stage 4, $M_5$ raises each received
value to $s_5$ and returns the value to the sender (e.g., $M_2$ would receive $a^{s_1 s_3 s_4 s_5}$).
Lastly, each member raises the received value to their secret number and arrives
at $K_n$. Figure 5 depicts GDH.3.
Figure 5. GDH.3 [From 16]
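The four stages of GDH.3 can be sketched as follows. This is a minimal
illustration, not the reference implementation of [16]: the function name
`gdh3`, the toy parameters p = 67 and a = 2, and the secrets are assumptions.
The sketch also assumes that "factoring out" $s_i$ means raising to the inverse of
$s_i$ modulo p - 1, which requires $s_1, \ldots, s_{n-1}$ to be coprime to p - 1, and it
relies on Python 3.8 or later for `pow(s, -1, m)`.

```python
from math import gcd


def gdh3(p, a, secrets):
    """Simulate GDH.3 for members M_1..M_n; return the key each member computes."""
    # Assumption: s_1..s_{n-1} must be invertible modulo p - 1.
    assert all(gcd(s, p - 1) == 1 for s in secrets[:-1])

    # Stage 1 (upflow): M_1..M_{n-1} each raise the running value to their secret.
    value = a
    for s in secrets[:-1]:
        value = pow(value, s, p)

    # Stage 2: M_{n-1} broadcasts a^{s_1 ... s_{n-1}} to the group.
    broadcast = value

    # Stage 3: each M_i (i < n) removes s_i from the exponent, sends result to M_n.
    to_mn = [pow(broadcast, pow(s, -1, p - 1), p) for s in secrets[:-1]]

    # Stage 4: M_n raises each received value to s_n and returns it to its sender.
    from_mn = [pow(x, secrets[-1], p) for x in to_mn]

    # Each M_i raises its returned value to s_i; M_n raises the broadcast to s_n.
    keys = [pow(x, s, p) for x, s in zip(from_mn, secrets[:-1])]
    keys.append(pow(broadcast, secrets[-1], p))
    return keys


# Assumed toy run with n = 5 members.
keys = gdh3(67, 2, [5, 7, 13, 17, 19])
print(keys)  # five identical values
```

Each member other than $M_n$ performs a constant number of exponentiations here,
which reflects the reduced per-member workload described above.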

It is much easier for Eve to attack GDH.3 than GDH.1. She needs only to
intercept and alter one message, and she can choose any of the first n-2
messages sent in the group. If Eve raises any one of these messages to q, $M_{n-1}$
will inevitably broadcast $a^{q s_1 s_2 \cdots s_{n-1}}$ to the group. At this point, each member
factors out their contribution, and forwards the result to $M_n$, leaving q in the
exponent of each message sent. $M_n$ simply raises each message to $s_n$ and returns
each message. Therefore, q is undisturbed, each member arrives at the same key
$K_n = a^{q s_1 s_2 \cdots s_n}$, and Eve has successfully forced the group into a small number of
possible values for the key. However, as mentioned above, if the parties agree
to use either authentication or prime-order subgroups during the key exchange,
attacks of this sort are prevented.
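The flexibility Eve has against GDH.3 can be checked by altering each upflow
message in turn. The sketch below uses assumed toy parameters (p = 67 = 6 * 11 + 1,
a = 2, fixed secrets coprime to p - 1) and a hypothetical function
`gdh3_attacked`. As a further assumption, factoring out a contribution is
modeled as raising to the inverse of $s_i$ modulo p - 1 (Python 3.8 or later).

```python
def gdh3_attacked(p, q, a, secrets, tampered):
    """GDH.3 run in which Eve raises upflow message number `tampered` to q.

    `tampered` is 0-based: 0 is M_1 -> M_2, and n - 3 is M_{n-2} -> M_{n-1}.
    """
    n = len(secrets)
    assert 0 <= tampered <= n - 3

    value = a
    for i, s in enumerate(secrets[:-1]):
        value = pow(value, s, p)
        if i == tampered:
            value = pow(value, q, p)  # Eve's single alteration, made in transit

    broadcast = value  # M_{n-1} unknowingly broadcasts a^{q s_1 ... s_{n-1}}

    # Stages 3 and 4 run honestly; q stays in every exponent.
    to_mn = [pow(broadcast, pow(s, -1, p - 1), p) for s in secrets[:-1]]
    from_mn = [pow(x, secrets[-1], p) for x in to_mn]

    keys = [pow(x, s, p) for x, s in zip(from_mn, secrets[:-1])]
    keys.append(pow(broadcast, secrets[-1], p))
    return keys


p, q, a = 67, 11, 2
R = (p - 1) // q
secrets = [5, 7, 13, 17, 19]
candidates = {pow(a, q * j, p) for j in range(R)}

# Any one of the first n - 2 messages works equally well for Eve.
results = [gdh3_attacked(p, q, a, secrets, t) for t in range(len(secrets) - 2)]
for keys in results:
    print(keys, keys[0] in candidates)
```

Whichever message is altered, the group agrees on the same key, and that key is
one of only R candidates.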
V.  RESULTS  AND  FUTURE  WORK


This  thesis  investigated  and  analyzed  a  particular  man-in-the-middle 
attack  on  the  Diffie-Hellman  key  exchange  protocol.  We  created  an  algorithm  to 
carry  out  the  attack  and  demonstrated  how  it  is  constrained  by  the  primality  test 
used by the attacker. In particular, if the Miller-Rabin primality test is used, the
algorithm’s complexity is [...] with N being the input prime number. We
showed that prime numbers of the form p = Rq + 1 with R bounded are common
among small primes but become increasingly rare as larger numbers are
considered. In fact, with small primes such as 128-bit primes, a reasonably sized R
will give an attacker a good chance of the prime being of the desired form.
However,  when  large  primes  such  as  1024  and  2048  bits  are  considered,  a  very 
large  value  of  R  is  required  to  give  an  attacker  a  reasonable  chance  of 
conducting  the  attack.  We  demonstrated  how  two  techniques,  authentication  and 
prime  order  subgroups,  can  prevent  the  attack.  In  fact,  it  appears  industry  has 
begun  to  adopt  the  prime  order  subgroup  technique  to  defend  against  the  attack. 
Finally,  we  demonstrated  how  the  attack  can  be  expanded  to  include  a  class  of 
multi-party  Diffie-Hellman  variants. 
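As a reminder of what the attacker's first step involves, the following Python
sketch searches for a decomposition p = Rq + 1 with R below a chosen bound,
using a Miller-Rabin test on each candidate q. It is an illustrative
reconstruction under stated assumptions and not necessarily the algorithm
developed earlier in this thesis; the names `is_probable_prime` and `find_r_q`,
the number of rounds, and the example primes are all hypothetical.

```python
import random


def is_probable_prime(n, rounds=20):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    # Write n - 1 = d * 2^r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        x = pow(random.randrange(2, n - 1), d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # this base witnesses that n is composite
    return True


def find_r_q(p, r_bound):
    """Return (R, q) with p = R*q + 1, 2 <= R <= r_bound and q a probable prime.

    Returns None if no such pair exists. p itself is assumed to be prime.
    """
    for R in range(2, r_bound + 1):
        if (p - 1) % R == 0 and is_probable_prime((p - 1) // R):
            return R, (p - 1) // R
    return None


print(find_r_q(67, 10))  # a decomposition exists within the bound
print(find_r_q(67, 5))   # no decomposition with R <= 5
```

The search costs at most one primality test per divisor R of p - 1 up to the
bound, so the primality test dominates the running time.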

Possible  future  efforts  include  coding  and  implementing  the  man-in-the- 
middle  attack  on  active  communications  to  test  the  theory  laid  out  in  this  thesis. 
It is possible that analyzing the given prime number, capturing the required
messages, altering those messages, and forwarding the messages to the
intended recipients will be too time-consuming. The resulting delay could alert the
parties to a possible compromise. In addition, it may be possible to alter the attack
to  compromise  communications  that  are  authenticated  and  render  several  Diffie- 
Hellman  variants  such  as  the  STS  protocol  vulnerable.  Other  future  work  may 
include  an  attempt  to  defeat  the  prime  order  subgroup  technique. 
APPENDIX:  ANOTHER  MAN-IN-THE-MIDDLE  ATTACK


This  appendix  details  a  possible  man-in-the-middle  attack  on  the  Diffie- 
Hellman  key  exchange  protocol,  if  no  prior  authentication  occurs  [17]. 

1)  Alice  sends  her  public  key  to  Bob,  but  Eve  intercepts  it,  and  Bob 
never  receives  the  key. 

2)  Eve  spoofs  Alice’s  identity  and  sends  over  her  public  key  to  Bob. 
Bob  now  thinks  that  he  has  Alice’s  public  key. 

3)  Bob  sends  his  public  key  to  Alice,  but  Eve  intercepts  it,  and  Alice 
never  receives  the  key. 

4)  Eve  spoofs  Bob’s  identity  and  sends  over  her  public  key  to  Alice. 
Alice  now  thinks  that  she  has  Bob’s  public  key. 

5) Alice combines her private key and Eve’s public key and creates
symmetric key S1.

6) Eve combines her private key and Alice’s public key and creates
symmetric key S1.

7)  Bob  combines  his  private  key  and  Eve’s  public  key  and  creates 
symmetric  key  S2. 

8)  Eve  combines  her  private  key  and  Bob’s  public  key  and  creates 
symmetric  key  S2. 

9) At this point, Alice and Eve share a symmetric key (S1) and Bob
and Eve share a different symmetric key (S2). Alice and Bob think
they are sharing a key between themselves and do not realize that
Eve is involved.

10) Alice writes a message to Bob, uses her symmetric key (S1) to
encrypt the message, and sends it.

11) Eve intercepts the message and decrypts it with the symmetric key
S1, reads or modifies the message and re-encrypts it with
symmetric key S2, and sends it to Bob.

12)  Bob  takes  symmetric  key  S2  and  uses  it  to  decrypt  and  read  the 
message. 

Figure  6  illustrates  the  attack  [17]. 


Figure  6.  Another  Man-in-the-Middle  Attack 
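The twelve steps above can be sketched in Python as follows. Everything
concrete here is an assumption made for illustration: the toy group (p = 67,
a = 2), the private keys, the sample message, and in particular `xor_cipher`,
a deliberately simplistic and insecure stand-in for whatever symmetric cipher
the parties would actually use.

```python
import hashlib

p, a = 67, 2  # assumed toy public parameters

alice_priv, bob_priv, eve_priv = 12, 29, 41
alice_pub = pow(a, alice_priv, p)
bob_pub = pow(a, bob_priv, p)
eve_pub = pow(a, eve_priv, p)

# Steps 1-4: Eve intercepts both public keys and sends her own to each party.
key_seen_by_bob = eve_pub
key_seen_by_alice = eve_pub

# Steps 5-8: each side combines its private key with the public key it holds.
s1_alice = pow(key_seen_by_alice, alice_priv, p)
s1_eve = pow(alice_pub, eve_priv, p)
s2_bob = pow(key_seen_by_bob, bob_priv, p)
s2_eve = pow(bob_pub, eve_priv, p)


def xor_cipher(key, data):
    """Toy cipher (NOT secure): XOR with a keystream derived from the key."""
    stream = hashlib.sha256(str(key).encode()).digest()
    return bytes(b ^ stream[i % len(stream)] for i, b in enumerate(data))


# Step 10: Alice encrypts under S1 and sends.
from_alice = xor_cipher(s1_alice, b"meet at noon")
# Step 11: Eve decrypts with S1, reads the message, and re-encrypts with S2.
read_by_eve = xor_cipher(s1_eve, from_alice)
to_bob = xor_cipher(s2_eve, read_by_eve)
# Step 12: Bob decrypts with S2.
read_by_bob = xor_cipher(s2_bob, to_bob)

print(s1_alice == s1_eve, s2_bob == s2_eve, s1_alice != s2_bob)
print(read_by_eve, read_by_bob)
```

Because XOR is its own inverse, the same toy function serves for both
encryption and decryption in this sketch.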